Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Create new column into dataframe based on values from other columns using apply function onto multiple columns

I am using the apply function to create a new column, ERROR_TV_TIC, in a DataFrame based on the values of two existing columns, TV_TIC and ERRORS. I am not sure what I am doing wrong: with some conditions it works, and with others it throws an error.

DataFrame:

ERRORS|TV_TIC
|2.02101E+41
['Length of Underlying Symbol for Option Contract is exceeding allowed limits(10 chars)']|nan
['Future Option Indicator is missing']|nan
['Trade Id is missing', 'Future Option Indicator is missing']|nan
['Trade Id is missing', 'Future Option Indicator is missing']|nan

Code when it works:

import numpy as np
import pandas as pd

def validate_tv_tic(trades):
    # With apply(..., axis=1) this function receives one row at a time
    tv_tiv_errors = list()
    if pd.isnull(trades['TV_TIC']):
        tv_tiv_errors.append("Initial validations passed still TV_TIC missing")
    if pd.notnull(trades['TV_TIC']) and len(trades['TV_TIC']) != 42:
        tv_tiv_errors.append("Initial validations passed and TV_TIC is also generated but length is != 42 chars")
    # Return the collected messages, or NaN when the row has none
    return tv_tiv_errors if len(tv_tiv_errors) > 0 else np.nan

trades['ERROR_TV_TIC'] = trades.apply(validate_tv_tic, axis=1)

Code when it doesn't work: here the condition is on two columns of the row, and I made sure to use "&" rather than "and".

def validate_tv_tic(trades):
    tv_tiv_errors = list()
    if pd.isnull(trades['ERRORS']) & pd.isnull(trades['TV_TIC']):
        tv_tiv_errors.append("Initial validations passed still TV_TIC missing")
    if pd.isnull(trades['ERRORS']) & pd.notnull(trades['TV_TIC']) & len(trades['TV_TIC']) != 42:
        tv_tiv_errors.append("Initial validations passed and TV_TIC is also generated but length is != 42 chars")
    return tv_tiv_errors if len(tv_tiv_errors) > 0 else np.nan

trades['ERROR_TV_TIC'] = trades.apply(validate_tv_tic, axis=1)

Error I am getting: ('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()', 'occurred at index 3')

Error description when "and" is used: [Error Screenshot 2]

Error description when "&" is used: [Error Screenshot 2]

My gut feeling is that pd.isnull is causing the problem somewhere, but I am not sure.

question from:https://stackoverflow.com/questions/65870550/create-new-column-into-dataframe-based-on-values-from-other-columns-using-apply


1 Answer


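There was no problem with the code itself. The problem was with the data inside the DataFrame.

The ERRORS column held lists of strings, and the error was thrown whenever a cell's list had more than one item. pd.isnull applied to a list returns an element-wise boolean array, and an array with more than one element has no single truth value in an if condition. So I was getting the error for the third and fourth rows below: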

ERRORS

['Length of Underlying Symbol for Option Contract is exceeding allowed limits(10 chars)']
['Future Option Indicator is missing']
['Trade Id is missing', 'Future Option Indicator is missing']
['Trade Id is missing', 'Future Option Indicator is missing']
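To see why only the multi-item rows fail, here is a minimal sketch. The two cell values are copied from the rows above, but the variable names are hypothetical and this is an illustration rather than the original code path.

```python
import numpy as np
import pandas as pd

# Cell values standing in for entries of the ERRORS column (names are hypothetical)
one_item = ['Future Option Indicator is missing']
two_items = ['Trade Id is missing', 'Future Option Indicator is missing']

# On a scalar, pd.isnull returns a single boolean
scalar_result = pd.isnull(np.nan)

# On a list, pd.isnull returns an element-wise boolean array
one_result = pd.isnull(one_item)
two_result = pd.isnull(two_items)

# Using the two-element array as an `if` condition raises ValueError
try:
    if pd.isnull(two_items):
        pass
    raised = False
    message = ''
except ValueError as exc:
    raised = True
    message = str(exc)

print(type(two_result), two_result.shape)
print(raised, message)
```

The same array appears on the left of "&" in the failing function, so switching between "and" and "&" does not help: the if statement still has to reduce an array to a single boolean.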

After finding the root cause, I changed each list to a single string whose elements are joined by a separator other than a comma, and that works for me.

I changed the return statement of the function validate_tv_tiv from

return tv_tiv_errors if len(tv_tiv_errors) > 0 else np.nan

to

return ' & '.join(errors) if len(errors) > 0 else np.nan

and this created my dataframe column ERRORS as below:

ERRORS

Length of Underlying Symbol for Option Contract is exceeding allowed limits(10 chars)
Future Option Indicator is missing
Trade Id is missing & Future Option Indicator is missing
Trade Id is missing & Future Option Indicator is missing
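Putting it together, here is a rough end-to-end sketch of how the row-wise check behaves once ERRORS holds strings or NaN. The sample data, the helper name join_errors, and the placeholder TV_TIC values are all assumptions made for illustration; only the messages and column names come from the question.

```python
import numpy as np
import pandas as pd

def join_errors(errors):
    # Hypothetical helper: collapse a list of messages into one string, or NaN if empty
    return ' & '.join(errors) if len(errors) > 0 else np.nan

# Invented sample data; 'X' * 42 is a placeholder for a valid 42-character TV_TIC
raw_errors = [
    [],
    [],
    [],
    ['Future Option Indicator is missing'],
    ['Trade Id is missing', 'Future Option Indicator is missing'],
]
trades = pd.DataFrame({
    'ERRORS': [join_errors(e) for e in raw_errors],
    'TV_TIC': ['X' * 42, 'SHORT', np.nan, np.nan, np.nan],
})

def validate_tv_tic(row):
    tv_tic_errors = []
    # Each cell is now a scalar (string or NaN), so pd.isnull returns one boolean
    errors_missing = pd.isnull(row['ERRORS'])
    tic_missing = pd.isnull(row['TV_TIC'])
    if errors_missing and tic_missing:
        tv_tic_errors.append("Initial validations passed still TV_TIC missing")
    if errors_missing and not tic_missing and len(str(row['TV_TIC'])) != 42:
        tv_tic_errors.append("Initial validations passed and TV_TIC is also generated but length is != 42 chars")
    return ' & '.join(tv_tic_errors) if tv_tic_errors else np.nan

trades['ERROR_TV_TIC'] = trades.apply(validate_tv_tic, axis=1)
print(trades)
```

The sketch uses "and" because every operand is a plain boolean inside a row-wise apply. If "&" is used instead, note that it binds more tightly than "!=", so the length comparison would need its own parentheses.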
