step-by-step demo:db<>fiddle
SELECT
s.uid, s.values,
sum(v.value) as score
FROM (
SELECT DISTINCT ON (uid)
uid,
CASE
WHEN cardinality(ok_count) > 0 THEN ok_count
ELSE ARRAY[last_value]
END as values
FROM (
SELECT
*,
ARRAY_AGG(groups_ids) FILTER (WHERE status = 'OK') OVER (PARTITION BY uid ORDER BY scheduled)as ok_count,
first_value(groups_ids) OVER (PARTITION BY uid ORDER BY year, month DESC) as last_value
FROM mytable
) s
ORDER BY uid, scheduled DESC
) s,
unnest(values) as u_group_id
JOIN (VALUES
(1, 100), (2, 80), (3, 60), (4, 50), (5,10), (6, 50), (7, 0)
) v(group_id, value) ON v.group_id = u_group_id
GROUP BY s.uid, s.values
Phew... quite complex. Let's have a look at the steps:
a)
SELECT
*,
-- 1:
ARRAY_AGG(groups_ids) FILTER (WHERE status = 'OK') OVER (PARTITION BY uid ORDER BY scheduled)as oks,
-- 2:
first_value(groups_ids) OVER (PARTITION BY uid ORDER BY year, month DESC) as last_value
FROM mytable
- Using the
array_agg()
window function to create an array of group_ids without loosing the other data as we would with simple GROUP BY
. The FILTER
clause is to put only the status = OK records into the array.
- Find the last group_id of a group (partition) using the
first_value()
window function. In descending order is returns the last value.
b)
SELECT DISTINCT ON (uid) -- 2
uid,
CASE -- 1
WHEN cardinality(ok_count) > 0 THEN ok_count
ELSE ARRAY[last_value]
END as values
FROM (
...
) s
ORDER BY uid, scheduled DESC -- 2
- The
CASE
clause either takes the previously created array (from step a1) or, if there is none, it takes the last value (from step a2), creates an one-elemented array.
- The
DISTINCT ON
clause returns only the first element of an ordered group. The group is your uid
and the order is given by the column scheduled
. Since you don't want the first, but last records within the group, you have to order it DESC
to make the most recent one the topmost record. That is taken by the DISTINCT ON
c)
SELECT
uid,
group_id
FROM (
...
) s,
unnest(values) as group_id -- 1
- The arrays should be extracted into one record per element. That helps to join the weighted values later.
d)
SELECT
s.uid, s.values,
sum(v.weighted_value) as score -- 2
FROM (
...
) s,
unnest(values) as u_group_id
JOIN (VALUES
(1, 100), (2, 80), ...
) v(group_id, weighted_value) ON v.group_id = u_group_id -- 1
GROUP BY s.uid, s.values -- 2
- Join your weighted value on the array elements. Naturally, this can be a table or query or whatever.
- Regroup the
uid
groups to calculate the SUM()
of the weighted_value
s
Additional note:
You should avoid duplicate data storing. You don't need to store the date parts year
, month
and day
if you also store the complete date. You can always calculate them from the date.