Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Create dataframe of relationships per ID (From, To)

I have a sample dataframe:

import pandas as pd

data = {'IDs': ['1', '1', '2', '2', '3', '3', '3', '4', '4', '5', '5'],
        'Terms': ["a", "b", "a", "d", "c", "f", "g", "a", "h", "i", "j"],
        'Values': [100, 100, 200, 200, 300, 300, 300, 400, 400, 500, 500]
        }

df = pd.DataFrame(data, columns=['IDs', 'Terms', 'Values'])

Which creates this:

   IDs Terms  Values
0    1     a     100
1    1     b     100
2    2     a     200
3    2     d     200
4    3     c     300
5    3     f     300
6    3     g     300
7    4     a     400
8    4     h     400
9    5     i     500
10   5     j     500

I want something that looks like the following table, where, for each ID, every term is paired with the other terms that share that ID, along with that ID's value (I don't need both directions).

I really don't know where to start. I have tried looking at contingency tables, but nothing really matches what I have. Any suggestions are welcome.

[Image: desired output table, not preserved]

Question from: https://stackoverflow.com/questions/65905449/create-dataframe-of-relationships-per-id-from-to


1 Answer


I don't know whether this is the most efficient way, but here is one approach using pandas groupby:

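import pandas as pd

rows = []
for _, group in df.groupby('IDs'):
    # Renumber each group's rows from 0 so the label lookups below line up.
    group = group.reset_index(drop=True)
    for i in range(len(group)):
        for j in range(i + 1, len(group)):
            # Pair term i with every later term j (one direction only).
            rows.append([group['Terms'][i], group['Terms'][j], group['Values'][i]])
df2 = pd.DataFrame(rows, columns=['From', 'To', 'Value'])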

Output:

  From To  Value
0    a  b    100
1    a  d    200
2    c  f    300
3    c  g    300
4    f  g    300
5    a  h    400
6    i  j    500
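
As a possible alternative to the nested index loops, the same pairing can be sketched with itertools.combinations, which yields each unordered pair of terms once. This is only a sketch under one assumption that the question implies but does not state: Values is constant within each ID, so the first value of the group is used for every pair. The helper name pair_terms is made up for illustration.

```python
from itertools import combinations

import pandas as pd

data = {
    'IDs': ['1', '1', '2', '2', '3', '3', '3', '4', '4', '5', '5'],
    'Terms': ['a', 'b', 'a', 'd', 'c', 'f', 'g', 'a', 'h', 'i', 'j'],
    'Values': [100, 100, 200, 200, 300, 300, 300, 400, 400, 500, 500],
}
df = pd.DataFrame(data)


def pair_terms(frame):
    """Return one (From, To, Value) row per unordered pair of terms within each ID."""
    rows = []
    # sort=False keeps the IDs in their order of first appearance.
    for _, group in frame.groupby('IDs', sort=False):
        # Assumption: Values is constant within an ID, so the first entry is used.
        value = group['Values'].iloc[0]
        # combinations() yields each pair once, following the original row order.
        for src, dst in combinations(group['Terms'], 2):
            rows.append((src, dst, value))
    return pd.DataFrame(rows, columns=['From', 'To', 'Value'])


df2 = pair_terms(df)
print(df2)
```

An ID with n terms produces n * (n - 1) / 2 rows, so an ID with a single term contributes nothing. If Values can differ within an ID, the loop above would need to choose which row's value to keep, as the index-based version does with the From row.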
