This is not an easy task. The accuracy depends on the number of past observations, and a data set as small as the one you shared will not give accurate results. However, the code below might give you an idea. As you guessed, you need to find relations between the products and use them. Below, I first compute the average price paid, to see what the shopper generally tends to pay.
idx_0_0 = np.multiply(merged['week'] == 0,1) * np.multiply(merged['shopper'] == 0,1)
averaged_paid_price_0_0 = np.average(merged['price'][idx_0_0 == 1])
idx_0_0 = np.multiply(merged['week'] == 1,1) * np.multiply(merged['shopper'] == 0,1)
averaged_paid_price_1_0 = np.average(merged['price'][idx_0_0 == 1])
idx_0_0 = np.multiply(merged['week'] == 2,1) * np.multiply(merged['shopper'] == 0,1)
averaged_paid_price_2_0 = np.average(merged['price'][idx_0_0 == 1])
total_paid_average_0 = (averaged_paid_price_2_0 + averaged_paid_price_1_0 + averaged_paid_price_0_0)/3
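As a side note, the same "average of weekly averages" can be sketched more compactly with `groupby`. This is only a sketch on the sample data from the question, and the names `weekly_mean` and `total_paid_average` are my own. One difference to keep in mind: this version averages over the weeks in which a shopper actually bought something, whereas dividing by 3 by hand gives NaN for a shopper with an empty week.

```python
import pandas as pd

merged = pd.DataFrame({
    'week':     [0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2],
    'shopper':  [0, 0, 0, 1, 1, 0, 1, 1, 2, 0, 2, 2],
    'product':  [63, 80, 91, 42, 77, 55, 77, 95, 77, 98, 202, 225],
    'price':    [543, 644, 770, 620, 560, 354, 525, 667, 525, 654, 783, 662],
    'discount': [0, 0, 10, 12, 0, 30, 10, 0, 0, 5, 0, 0],
})

# Mean price per (shopper, week) ...
weekly_mean = merged.groupby(['shopper', 'week'])['price'].mean()
# ... then the mean of those weekly means for each shopper.
total_paid_average = weekly_mean.groupby(level='shopper').mean()

print(total_paid_average)
```

For shopper 0, who has purchases in all three weeks, `total_paid_average.loc[0]` matches `total_paid_average_0` from the code above.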
Then I divided each product price by total_paid_average_0, as below:
merged_price_points_0 = merged['price'] / total_paid_average_0
This basically gives each product a price score relative to what the shopper usually pays.
Next, I checked whether there is any relation between the shopper's tendency and discounts:
idx_0_0_discount = np.multiply(merged['week'] == 0,1) * np.multiply(merged['discount'] != 0,1) * np.multiply(merged['shopper'] == 0,1)
discount_exist_0_0 = np.sum(idx_0_0_discount) / np.sum(np.multiply(merged['shopper'] == 0,1))
idx_0_0_discount = np.multiply(merged['week'] == 1,1) * np.multiply(merged['discount'] != 0,1) * np.multiply(merged['shopper'] == 0,1)
discount_exist_1_0 = np.sum(idx_0_0_discount) / np.sum(np.multiply(merged['shopper'] == 0,1))
idx_0_0_discount = np.multiply(merged['week'] == 2,1) * np.multiply(merged['discount'] != 0,1) * np.multiply(merged['shopper'] == 0,1)
discount_exist_2_0 = np.sum(idx_0_0_discount) / np.sum(np.multiply(merged['shopper'] == 0,1))
discount_point_0 = (discount_exist_0_0 + discount_exist_1_0 + discount_exist_2_0) / 3
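Because every weekly count is divided by the same total (all rows of shopper 0), the three terms add up to the overall share of discounted purchases, and `discount_point_0` is that share divided by the number of weeks. A minimal sketch of this shortcut, assuming the sample data from the question (the names `shopper0`, `discount_share` and `n_weeks` are mine):

```python
import pandas as pd

merged = pd.DataFrame({
    'week':     [0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2],
    'shopper':  [0, 0, 0, 1, 1, 0, 1, 1, 2, 0, 2, 2],
    'product':  [63, 80, 91, 42, 77, 55, 77, 95, 77, 98, 202, 225],
    'price':    [543, 644, 770, 620, 560, 354, 525, 667, 525, 654, 783, 662],
    'discount': [0, 0, 10, 12, 0, 30, 10, 0, 0, 5, 0, 0],
})

# All purchases of shopper 0.
shopper0 = merged[merged['shopper'] == 0]

# Share of shopper 0's purchases that had a non-zero discount (3 of 5 here).
discount_share = (shopper0['discount'] != 0).mean()

# Dividing by the number of weeks reproduces the average of the weekly ratios.
n_weeks = merged['week'].nunique()
discount_point_0 = discount_share / n_weeks

print(discount_point_0)
```

With this data, three of shopper 0's five purchases were discounted, so the score is (3/5)/3 = 0.2.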
Again, I calculated a score, this time for discounts. Finally, I combined all the scores.
The full code is below.
import pandas as pd
import numpy as np
merged = pd.DataFrame({'week' : [0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2],
'shopper' : [0, 0, 0, 1, 1, 0, 1, 1, 2, 0, 2, 2],
'product' : [63, 80, 91, 42, 77, 55, 77, 95, 77, 98, 202, 225],
'price' : [543, 644, 770, 620, 560, 354, 525, 667, 525, 654, 783, 662],
'discount' : [0, 0, 10, 12, 0, 30, 10, 0, 0, 5, 0, 0]
})
idx_0_0 = np.multiply(merged['week'] == 0,1) * np.multiply(merged['shopper'] == 0,1)
averaged_paid_price_0_0 = np.average(merged['price'][idx_0_0 == 1])
idx_0_0 = np.multiply(merged['week'] == 1,1) * np.multiply(merged['shopper'] == 0,1)
averaged_paid_price_1_0 = np.average(merged['price'][idx_0_0 == 1])
idx_0_0 = np.multiply(merged['week'] == 2,1) * np.multiply(merged['shopper'] == 0,1)
averaged_paid_price_2_0 = np.average(merged['price'][idx_0_0 == 1])
total_paid_average_0 = (averaged_paid_price_2_0 + averaged_paid_price_1_0 + averaged_paid_price_0_0)/3
idx_0_0 = np.multiply(merged['week'] == 0,1) * np.multiply(merged['shopper'] == 1,1)
averaged_paid_price_0_1 = np.mean(merged['price'][idx_0_0 == 1])
idx_0_0 = np.multiply(merged['week'] == 1,1) * np.multiply(merged['shopper'] == 1,1)
averaged_paid_price_1_1 = np.mean(merged['price'][idx_0_0 == 1])
idx_0_0 = np.multiply(merged['week'] == 2,1) * np.multiply(merged['shopper'] == 1,1)
averaged_paid_price_2_1 = np.mean(merged['price'][idx_0_0 == 1])
total_paid_average_1 = (averaged_paid_price_2_1 + averaged_paid_price_1_1 + averaged_paid_price_0_1)/3
idx_0_0 = np.multiply(merged['week'] == 0,1) * np.multiply(merged['shopper'] == 2,1)
averaged_paid_price_0_2 = np.mean(merged['price'][idx_0_0 == 1])
idx_0_0 = np.multiply(merged['week'] == 1,1) * np.multiply(merged['shopper'] == 2,1)
averaged_paid_price_1_2 = np.mean(merged['price'][idx_0_0 == 1])
idx_0_0 = np.multiply(merged['week'] == 2,1) * np.multiply(merged['shopper'] == 2,1)
averaged_paid_price_2_2 = np.mean(merged['price'][idx_0_0 == 1])
total_paid_average_2 = (averaged_paid_price_2_2 + averaged_paid_price_1_2 + averaged_paid_price_0_2)/3
merged_price_points_0 = merged['price'] / total_paid_average_0
idx_0_0_discount = np.multiply(merged['week'] == 0,1) * np.multiply(merged['discount'] != 0,1) * np.multiply(merged['shopper'] == 0,1)
discount_exist_0_0 = np.sum(idx_0_0_discount) / np.sum(np.multiply(merged['shopper'] == 0,1))
idx_0_0_discount = np.multiply(merged['week'] == 1,1) * np.multiply(merged['discount'] != 0,1) * np.multiply(merged['shopper'] == 0,1)
discount_exist_1_0 = np.sum(idx_0_0_discount) / np.sum(np.multiply(merged['shopper'] == 0,1))
idx_0_0_discount = np.multiply(merged['week'] == 2,1) * np.multiply(merged['discount'] != 0,1) * np.multiply(merged['shopper'] == 0,1)
discount_exist_2_0 = np.sum(idx_0_0_discount) / np.sum(np.multiply(merged['shopper'] == 0,1))
discount_point_0 = (discount_exist_0_0 + discount_exist_1_0 + discount_exist_2_0) / 3
merged_price_points_0 = merged_price_points_0.T
points_list = list()
total_point = list()
for counter in range(len(merged['product'])):
    if merged['discount'][counter] != 0:
        points_list.append(discount_point_0)
    else:
        points_list.append(0)
    if merged_price_points_0[counter] > 1:
        merged_price_points_0[counter] = merged_price_points_0[counter] - 1
    else:
        merged_price_points_0[counter] = 1-merged_price_points_0[counter]
    total_point.append(merged_price_points_0[counter] +points_list[counter] )
sum_of_points = np.sum(total_point)
possibility_of_product_week3_for_0 = total_point / sum_of_points
print("Possibility of 3th Week for 0")
for counter in range(len(merged['product'])):
    print(str(merged['product'][counter]) + "||" + str(possibility_of_product_week3_for_0[counter]))
Output
Possibility of 3th Week for 0
63||0.005959173323190062
80||0.05166730062127548
91||0.18671231139850392
42||0.10112843920375306
77||0.0037403321922150397
55||0.1769494104222138
77||0.07938379612019782
95||0.06479016102447063
77||0.01622923798656016
98||0.12052745023456325
202||0.13097502218841128
225||0.06193736528464565
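If the loop is hard to follow, the same scoring can be sketched in vectorized form. This is just a rewrite of the steps above under the same assumptions (sample data, shopper 0 only); the names `price_points`, `discount_points` and `possibility` are mine. The score of each row is the absolute deviation of its price ratio from 1, plus the discount score if the row had a discount, normalized so that all scores sum to 1.

```python
import numpy as np
import pandas as pd

merged = pd.DataFrame({
    'week':     [0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2],
    'shopper':  [0, 0, 0, 1, 1, 0, 1, 1, 2, 0, 2, 2],
    'product':  [63, 80, 91, 42, 77, 55, 77, 95, 77, 98, 202, 225],
    'price':    [543, 644, 770, 620, 560, 354, 525, 667, 525, 654, 783, 662],
    'discount': [0, 0, 10, 12, 0, 30, 10, 0, 0, 5, 0, 0],
})

shopper0 = merged[merged['shopper'] == 0]

# Average of shopper 0's weekly average prices.
total_paid_average_0 = shopper0.groupby('week')['price'].mean().mean()

# Share of shopper 0's purchases with a discount, divided by the number of weeks.
discount_point_0 = (shopper0['discount'] != 0).mean() / merged['week'].nunique()

# Price score: how far each price ratio is from 1, in either direction.
price_points = (merged['price'] / total_paid_average_0 - 1).abs()

# Discount score: discount_point_0 for discounted rows, 0 otherwise.
discount_points = np.where(merged['discount'] != 0, discount_point_0, 0.0)

# Combine the two scores and normalize them so they sum to 1.
total_point = price_points + discount_points
possibility = total_point / total_point.sum()

for product, p in zip(merged['product'], possibility):
    print(str(product) + "||" + str(p))
```

Up to floating-point rounding, this should print the same values as the output above.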
I would suggest looking into what Chris commented on. This is not a solid answer, but it might give you an idea. The main idea is to define the relationship between the products and why the shopper buys them, and to score them accordingly.