Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.0k views
in Technique[技术] by (71.8m points)

python - Pandas find all combinations of rows under a budget

I am trying to figure out a way to determine all possible combinations of rows within a DataFrame that are below a budget, so let's say I have a dataframe like this:

data = [['Bread', 9, 'Food'], ['Shoes', 20, 'Clothes'], ['Shirt', 15, 'Clothes'], ['Milk', 5, 'Drink'], ['Cereal', 8, 'Food'], ['Chips', 10, 'Food'], ['Beer', 15, 'Drink'], ['Popcorn', 3, 'Food'], ['Ice Cream', 6, 'Food'], ['Soda', 4, 'Drink']]
df = pd.DataFrame(data, columns = ['Item', 'Price', 'Type'])
df

Data

Item       Price  Type
Bread      9      Food
Shoes      20     Clothes
Shirt      15     Clothes
Milk       5      Drink
Cereal     8      Food
Chips      10     Food
Beer       15     Drink
Popcorn    3      Food
Ice Cream  6      Food
Soda       4      Drink

I want to find every combination that I could purchase for under a specific budget, let's say $35 for this example, while only getting one of each type. I'd like to get a new dataframe made up of rows for each combination that works with each item in its own column.

I was trying to do it using itertools.product, but this can combine and add columns, but what I really need to do is combine and add a specific column based on values in another column. I'm a bit stumped now.

Thanks for your help!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Here a way using powerset recipe from itertools with pd.concat

from itertools import chain, combinations

def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

df_groups = pd.concat([df.reindex(l).assign(grp=n) for n, l in 
                       enumerate(powerset(df.index)) 
                       if (df.loc[l, 'Price'].sum() <= 35)])

Outputs a single dataframe with groups of product that meet $35 condition:

          Item  Price     Type  grp
0       Bread      9     Food    1
1       Shoes     20  Clothes    2
2       Shirt     15  Clothes    3
3        Milk      5    Drink    4
4      Cereal      8     Food    5
..        ...    ...      ...  ...
3        Milk      5    Drink  752
4      Cereal      8     Food  752
7     Popcorn      3     Food  752
8   Ice Cream      6     Food  752
9        Soda      4    Drink  752

How many ways this came combined to meet $35 budget?

df_groups['grp'].nunique()

Output:

258

Details:

There are a couple of tricks/methods that are used here. First, we are using the index of the dataframe to create groups of rows or items using powerset. Next, we are using enumerate to identify each group and with assign creating a new column in a dataframe with that group number from enumerate.

Modify to capture no more than one of each type:

df_groups = pd.concat([df.reindex(l).assign(grp=n) for n, l in 
                       enumerate(powerset(df.index)) 
                       if ((df.loc[l, 'Price'].sum() <= 35) & 
                           (df.loc[l, 'Type'].value_counts()==1).all())])

How many groups?

df_groups['grp'].nunique()
62

Get exactly one for each Type:

df_groups = pd.concat([df.reindex(l).assign(grp=n) for n, l in 
                       enumerate(powerset(df.index)) 
                       if ((df.loc[l, 'Price'].sum() <= 35) & 
                           (df.loc[l, 'Type'].value_counts()==1).all()&
                           (len(df.loc[l, 'Type']) == 3))])

How many groups?

df_groups['grp'].nunique()
21

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...