I have a data frame with two columns: one specifies the type of offense, and the other gives the length of the sentence. I need to compare sentence lengths pairwise across the two offense types. For example, if I had three cases (A, B, C) for offense type 1 and two cases (D, E) for offense type 2, I would need to compare A with D and E, B with D and E, and C with D and E, and then find out in how many of these comparisons the sentence for offense type 1 (cases A, B, C) is greater than the sentence for offense type 2 (cases D and E).
I was thinking of a for loop that compares every type 1 offense with every type 2 offense, writing a 1 if the type 1 sentence is greater and a 0 otherwise, and then dividing the sum by the number of comparisons to get the probability of it being greater. There is probably a more efficient way, though.
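To make the comparison concrete, here is a minimal Python sketch of one way to avoid the explicit double loop. It assumes the data frame can be read as rows of `(offense_type, sentence_length)`; the sample values, the `rows` list, and the `prob_greater` function are hypothetical and only for illustration. Sorting the type 2 sentences once and using a binary search for each type 1 sentence gives the same count as comparing every pair, and ties are counted as "not greater".

```python
from bisect import bisect_left

# Hypothetical stand-in for the data frame: (offense_type, sentence_length) rows.
rows = [
    (1, 10), (1, 5), (1, 7),  # cases A, B, C (offense type 1)
    (2, 6), (2, 8),           # cases D, E (offense type 2)
]


def prob_greater(rows, type_a=1, type_b=2):
    """Return (wins, comparisons, share) for type_a sentences exceeding type_b ones."""
    a = [length for kind, length in rows if kind == type_a]
    b = sorted(length for kind, length in rows if kind == type_b)
    if not a or not b:
        raise ValueError("both offense types need at least one case")

    # bisect_left(b, x) is the number of values in b strictly less than x,
    # i.e. the number of type_b sentences that x beats.
    wins = sum(bisect_left(b, x) for x in a)
    comparisons = len(a) * len(b)
    return wins, comparisons, wins / comparisons


wins, comparisons, share = prob_greater(rows)
print(wins, comparisons, share)
```

With the made-up values above, 3 of the 6 pairwise comparisons have the type 1 sentence greater, so the share is 0.5. The same counts come out of the plain nested loop; the sorted version only matters when both groups are large.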
question from:
https://stackoverflow.com/questions/65882025/how-to-compare-values-based-on-a-classifying-variable