I want to delete rows from a DataFrame (DF) based on a mixed condition. My DF contains these columns:
['ID', 'SEQNO', 'DESCRIP']
DF has multiple rows for a given 'ID', but each 'ID' + 'SEQNO' combination is unique. Suppose my DF holds this data:
ID SEQNO DESCRIP
A1 1 This is test
A1 2 To check DF
A1 3 XYZ
A1 4 ghj
B1 1 Hello, How are you.
B1 2 XYZ
B1 3 I am Fine
B1 4 Thank You.
B1 5 and you.
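To make this reproducible, the sample data above could be built roughly like this. The variable name df is my choice here, and I am assuming SEQNO is stored as an integer column:

```python
import pandas as pd

# Sample data from the table above; ID + SEQNO is unique per row.
df = pd.DataFrame({
    "ID": ["A1", "A1", "A1", "A1", "B1", "B1", "B1", "B1", "B1"],
    "SEQNO": [1, 2, 3, 4, 1, 2, 3, 4, 5],
    "DESCRIP": [
        "This is test",
        "To check DF",
        "XYZ",
        "ghj",
        "Hello, How are you.",
        "XYZ",
        "I am Fine",
        "Thank You.",
        "and you.",
    ],
})
```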
Expected Output:
ID SEQNO DESCRIP
A1 1 This is test
A1 2 To check DF
B1 1 Hello, How are you.
For each ID, I want to keep only the rows whose SEQNO is less than the SEQNO of the row where 'XYZ' appears in DESCRIP for that ID. I tried the code below, but it filters the whole DF once per matching row and takes a long time:
rdf = df[df['DESCRIP'].str.contains('XYZ', na=False, case=False)]
for i in range(rdf.shape[0]):
    df = df[~((df['ID'] == rdf['ID'].iloc[i]) & (df['SEQNO'] >= rdf['SEQNO'].iloc[i]))]
I would appreciate any suggestions on improving this code. Thank you.
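One possible vectorized direction, as a sketch only and not a tested solution for large data: compute the smallest SEQNO containing 'XYZ' per ID with a groupby, broadcast it back to every row, and filter once. The function name drop_from_marker and its marker parameter are hypothetical names for illustration, and I am assuming that an ID with no 'XYZ' row should be kept whole, which is what the loop above does.

```python
import pandas as pd

def drop_from_marker(df, marker="XYZ"):
    """Keep rows whose SEQNO is below the first marker row of their ID.

    Hypothetical helper: IDs that never contain the marker are kept whole.
    """
    # Same match rule as the loop version: case-insensitive, NaN counts as no match.
    hit = df["DESCRIP"].str.contains(marker, na=False, case=False)
    # SEQNO where the marker occurs (NaN elsewhere), then the per-ID minimum
    # broadcast back onto every row of that ID.
    first_hit = df["SEQNO"].where(hit).groupby(df["ID"]).transform("min")
    # NaN means the ID has no marker row, so those rows are kept.
    return df[first_hit.isna() | (df["SEQNO"] < first_hit)]

# Usage on the sample data from the question.
df = pd.DataFrame({
    "ID": ["A1", "A1", "A1", "A1", "B1", "B1", "B1", "B1", "B1"],
    "SEQNO": [1, 2, 3, 4, 1, 2, 3, 4, 5],
    "DESCRIP": ["This is test", "To check DF", "XYZ", "ghj",
                "Hello, How are you.", "XYZ", "I am Fine",
                "Thank You.", "and you."],
})
result = drop_from_marker(df)
print(result)
```

This replaces the per-match filtering with a single pass, so the cost should no longer grow with the number of 'XYZ' rows.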