Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Get numerical values from Dataframe column and add them to separate column

I have a pandas DataFrame like this:

    position base  text
1   458372   A    19:t|12:cg|7:CG|1:tcag|1:T
2   458373   C    21:GCA|3:GCG|3:ATA|2:GCGAA|1:GTA|1:CGAG|1:g

I would like to extract the numbers from the text column and store their sum in a new column. Values in the text column contain numbers separated by non-numeric characters ([^0-9]). In the first row, the numbers in the text column are 19, 12, 7, 1, and 1, which add up to 40; that would be the value in the new column. The resulting DataFrame would look like:

    position base  text                                          text_sum 
1   458372   A    19:t|12:cg|7:CG|1:tcag|1:T                    40
2   458373   C    21:GCA|3:GCG|3:ATA|2:GCGAA|1:GTA|1:CGAG|1:g   32

Any clues as to how to approach this?

question from:https://stackoverflow.com/questions/65882965/get-numerical-values-from-dataframe-column-and-add-them-to-separate-column


1 Answer


Use Series.str.extractall to pull out every run of digits, convert the matches to integers, and then sum them per original row (level 0 of the resulting MultiIndex):

df['text_sum'] = df['text'].str.extractall(r'(\d+)')[0].astype(int).groupby(level=0).sum()
print(df)

   position base                                         text  text_sum
1    458372    A                   19:t|12:cg|7:CG|1:tcag|1:T        40
2    458373    C  21:GCA|3:GCG|3:ATA|2:GCGAA|1:GTA|1:CGAG|1:g        32
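
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the extractall approach. The DataFrame is rebuilt from the sample in the question; the intermediate names matches and text_sum are only illustrative, and the final reindex step is an added assumption for rows that might contain no digits at all (they would otherwise end up as NaN).

```python
import pandas as pd

# Sample data rebuilt from the question, keeping the 1-based index.
df = pd.DataFrame(
    {
        "position": [458372, 458373],
        "base": ["A", "C"],
        "text": [
            "19:t|12:cg|7:CG|1:tcag|1:T",
            "21:GCA|3:GCG|3:ATA|2:GCGAA|1:GTA|1:CGAG|1:g",
        ],
    },
    index=[1, 2],
)

# One row per regex match, indexed by (original index, match number).
matches = df["text"].str.extractall(r"(\d+)")

# Column 0 holds the captured digits as strings: convert, then sum per original row.
text_sum = matches[0].astype(int).groupby(level=0).sum()

# Assumption: rows without any digits should get 0 instead of NaN.
df["text_sum"] = text_sum.reindex(df.index, fill_value=0)

print(df)
```

The groupby(level=0) call groups on the first level of the MultiIndex that extractall returns, which is the original row label, so the sums line up with df when assigned back.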

Alternatively, if every item has the form count:sequence and the items are separated by |, split on | and sum the integer before each colon:

df['text_sum'] = df['text'].apply(lambda x: sum(int(y.split(':')[0]) for y in x.split('|')))
print (df)
   position base                                         text  text_sum
1    458372    A                   19:t|12:cg|7:CG|1:tcag|1:T        40
2    458373    C  21:GCA|3:GCG|3:ATA|2:GCGAA|1:GTA|1:CGAG|1:g        32
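
The two approaches agree on the sample data but can differ on other input, so it may help to see them side by side. The sketch below is illustrative only: sum_counts is a hypothetical helper name, and the second string ("3:A1|2:C") is made-up input, not from the question, chosen to show that the regex counts a digit inside the sequence part while the split version reads only the number before each colon.

```python
import pandas as pd


def sum_counts(text: str) -> int:
    """Sum the integer before ':' in each '|'-separated item (hypothetical helper)."""
    return sum(int(item.split(":")[0]) for item in text.split("|"))


# First string is from the question; the second is invented to show the difference.
s = pd.Series(["19:t|12:cg|7:CG|1:tcag|1:T", "3:A1|2:C"])

# Split-based: only the leading count of each item is used.
split_sum = s.apply(sum_counts)

# Regex-based: every run of digits anywhere in the string is used.
regex_sum = s.str.extractall(r"(\d+)")[0].astype(int).groupby(level=0).sum()

print(split_sum.tolist())
print(regex_sum.tolist())
```

The split version raises a ValueError if an item does not start with an integer, which can be useful as a format check; the regex version silently accepts any layout.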
