Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Different operations on different columns of a pandas DataFrame

I would like to apply a specific operation to each column of my DataFrame; specifically, the same operation to every column except the last one.

I got this working with help from Google, but the solution seems quite clunky to me.

Can you help me improve it?

import pandas as pd

d = {
    'col1': [1, 2, 4, 7], 
    'col2': [3, 4, 9, 1], 
    'col3': [5, 2, 11, 4], 
    'col4': [True, True, False, True]
}
df = pd.DataFrame(data=d)

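# returns the column unchanged
def do_nothing(x):
    return x

# element-wise "less than 2" comparison
def minor(x):
    return x < 2

# builds a function that looks up the operation by column name
def multi_func(functions):
    def f(col):
        return functions[col.name](col)
    return f

result = df.apply(multi_func({'col1': minor, 'col2': minor,
                               'col3': minor, 'col4': do_nothing}))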

Thank you all

question from:https://stackoverflow.com/questions/66061636/different-operations-on-different-columns-of-a-pandas-dataframe


1 Answer


Use DataFrame.aggregate instead: its func parameter accepts, among other forms, a dict that maps each column label to the function to apply to that column:

res = df.aggregate({'col1': minor, 'col2': minor, 'col3': minor, 'col4': do_nothing})

print(res)

Output (in the context of the script in question):


    col1   col2   col3   col4
0   True  False  False   True
1  False  False  False   True
2  False  False  False  False
3  False   True  False   True
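
Because minor and do_nothing return one value per row instead of reducing a column to a single value, DataFrame.transform may be a closer semantic fit; it accepts the same kind of dict. The sketch below is not part of the original answer. It reuses the functions from the question, and some pandas 2.x versions may emit a FutureWarning pointing to transform when aggregate is given non-reducing functions.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 4, 7],
    'col2': [3, 4, 9, 1],
    'col3': [5, 2, 11, 4],
    'col4': [True, True, False, True],
})

# returns the input unchanged
def do_nothing(x):
    return x

# element-wise "less than 2" comparison
def minor(x):
    return x < 2

# transform requires every function to return a result
# with the same length as the column it receives
res = df.transform({'col1': minor, 'col2': minor,
                    'col3': minor, 'col4': do_nothing})

print(res)
```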

A somewhat “smarter” way to write this is to turn the literal 2 into a parameter and to replace do_nothing with a name that better reflects how the input is handled:

import pandas as pd
 
d = {
    'col1': [1, 2, 4, 7], 
    'col2': [3, 4, 9, 1], 
    'col3': [5, 2, 11, 4], 
    'col4': [True, True, False, True]
}
df = pd.DataFrame(data=d)

# identity function: returns its input unchanged
copy = lambda x: x

# lt ("less than"): returns a function that compares its input to the bound argument
def lt(arg):
    return lambda x: x < arg

res = df.aggregate({'col1': lt(2), 'col2': lt(2), 'col3': lt(2), 'col4': copy})

print(res)

Same output as above.
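
Since the question asks for the same operation on every column except the last, the dict does not have to be typed out by hand. A minimal sketch, assuming the last column is always the one to leave untouched; the names identity and ops are hypothetical and not from the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 4, 7],
    'col2': [3, 4, 9, 1],
    'col3': [5, 2, 11, 4],
    'col4': [True, True, False, True],
})

# returns a function that compares its input to the bound argument
def lt(arg):
    return lambda x: x < arg

# hypothetical helper: returns its input unchanged
def identity(x):
    return x

# map every column except the last to lt(2), and the last one to identity
ops = {name: lt(2) for name in df.columns[:-1]}
ops[df.columns[-1]] = identity

res = df.aggregate(ops)

print(res)
```

For a plain comparison like this one, pd.concat([df.iloc[:, :-1] < 2, df.iloc[:, -1:]], axis=1) should give the same result without any helper functions; the dict form is mainly useful when the columns need different operations.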

