Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


sql - Count days per month from days off table

I have a table which stores a person, the start of a holiday and the end of that holiday.

From it I need to count how many working days per month a person was on holiday, so I want to partition this table by month. To get public holidays I'm using: https://github.com/christopherthompson81/pgsql_holidays

Let's assume I have a table for one person only, with just start and stop columns.

create table data (id int, start date, stop date);

This is the network_days function I wrote:

CREATE OR REPLACE FUNCTION network_days(start_date date , stop_date date) RETURNS bigint AS $$
    SELECT count(*) FROM
        generate_series(start_date , stop_date - interval '1 minute' , interval '1 day') the_day
    WHERE
        extract('ISODOW' FROM the_day) < 6 AND the_day NOT IN (
        -- use the function's own parameters; no relation aliased "o" exists in this scope
        SELECT datestamp::timestamptz FROM holidays_poland (extract(year FROM start_date)::int, extract(year FROM stop_date)::int))
$$
LANGUAGE sql
STABLE;

and I created a function with a query like:

--$2 = 2020
SELECT
    month, year, sum(value_per_day)
    FROM (
        SELECT to_char(dt , 'mm') AS month, to_char(dt, 'yyyy') AS year, (network_days ((
            CASE WHEN EXTRACT(year FROM df.start_date) < $2 THEN (SELECT date_trunc('year' , df.start_date) + interval '1 year')::date
            ELSE df.start_date END) , ( CASE WHEN EXTRACT(year FROM df.stop_date) > $2 THEN (date_trunc('year' , df.stop_date))::date
        ELSE
            df.stop_date END))::int ::numeric / count(*) OVER (PARTITION BY id))::int AS value_per_day
        FROM intranet.dayoff df
        LEFT JOIN generate_series((
            CASE WHEN EXTRACT(year FROM df.start_date) < $2 THEN (SELECT date_trunc('year' , df.start_date) + interval '1 year')::date ELSE df.start_date
            END) , (CASE WHEN EXTRACT(year FROM df.stop_date) > $2 THEN (date_trunc('year' , df.stop_date))::date
                ELSE df.stop_date END) - interval '1 day' , interval '1 day') AS t (dt) ON extract('ISODOW' FROM dt) < 6
    WHERE
        extract(isodow FROM dt) < 6 AND (EXTRACT(year FROM start_date) = $2 OR EXTRACT(year FROM stop_date) = $2)) t
GROUP BY month, year
ORDER BY month;

based on: https://dba.stackexchange.com/questions/237745/postgresql-split-date-range-by-business-days-then-aggregate-by-month?rq=1

and I almost have it. It returns 10 rows:

| month | year | sum |
| ----- | ---- | --- |
| 03    | 2020 | 2   |
| 04    | 2020 | 13  |
| 06    | 2020 | 1   |
| 11    | 2020 | 1   |
| 12    | 2020 | 2   |
| 05    | 2020 | 1   |
| 10    | 2020 | 2   |
| 08    | 2020 | 10  |
| 01    | 2020 | 1   |
| 02    | 2020 | 1   |

So in the function I created, I'd need to add something like this:

dt NOT IN (SELECT datestamp::timestamptz FROM holidays_poland ($2, $2))

but I end up with many conditions, and I feel like this is the wrong approach. I feel like I should just somehow split the table from:

id  start               stop
1   31.12.2019 00:00:00 01.01.2020 00:00:00
2   30.03.2020 00:00:00 14.04.2020 00:00:00
3   01.05.2020 00:00:00 03.05.2020 00:00:00

to

start               stop
30.03.2020 00:00:00 01.01.2020 00:00:00
01.01.2020 00:00:00 14.04.2020 00:00:00
01.05.2020 00:00:00 03.05.2020 00:00:00

and then just run the network_days function on each date range, but I couldn't partition my query of the table to get such a result. What do you think is the best way to calculate this?

question from:https://stackoverflow.com/questions/65933007/count-days-per-month-from-days-off-table


1 Answer


demo:db<>fiddle

SELECT
    gs::date
FROM person_holidays p,
    generate_series(p.start, p.stop, interval '1 day') gs  -- 1
WHERE gs::date NOT IN (SELECT holiday FROM holidays)       -- 2
    AND EXTRACT(isodow from gs::date) < 6                  -- 3
  1. Generate a date series from each person's start and stop dates
  2. Exclude all dates found in the holidays table
  3. If necessary: exclude all weekend days (Saturday and Sunday)
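
To try the query locally, a minimal setup could look like the sketch below. The table and column names (`person_holidays`, `holidays`, `holiday`) are taken from the query above; the column types and the sample rows are assumptions. The date ranges are copied from the question, and three Polish public holidays from 2020 are added purely for illustration.

```sql
-- Assumed schema; only the table and column names come from the query above.
CREATE TABLE person_holidays (
    id    int,
    start date,
    stop  date
);

CREATE TABLE holidays (
    holiday date
);

-- Date ranges taken from the question.
INSERT INTO person_holidays (id, start, stop) VALUES
    (1, DATE '2019-12-31', DATE '2020-01-01'),
    (2, DATE '2020-03-30', DATE '2020-04-14'),
    (3, DATE '2020-05-01', DATE '2020-05-03');

-- Illustrative public holidays (assumed sample data).
INSERT INTO holidays (holiday) VALUES
    (DATE '2020-01-01'),  -- New Year's Day
    (DATE '2020-04-13'),  -- Easter Monday
    (DATE '2020-05-01');  -- Labour Day
```

Note that `generate_series(p.start, p.stop, ...)` includes the stop date itself. If the stop date should be exclusive, as in the question's `network_days`, one day would have to be subtracted from it.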

Afterwards you are able to GROUP BY months and count the records:

SELECT
    date_trunc('month', gs),
    COUNT(*)
FROM person_holidays p,
    generate_series(p.start, p.stop, interval '1 day') gs
WHERE gs::date NOT IN (SELECT holiday FROM holidays)
    AND EXTRACT(isodow from gs::date) < 6
GROUP BY 1
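
Applied to the tables from the question, the same idea might look like the sketch below. It assumes that `intranet.dayoff` has `start_date` and `stop_date` columns, that `stop_date` is exclusive (the question's `network_days` subtracts one minute from it), and that `holidays_poland(first_year, last_year)` returns a `datestamp` column, as used in the question. `NOT EXISTS` replaces `NOT IN` here so that a NULL in the holiday list cannot silently empty the result; that swap is a suggestion, not part of the original answer.

```sql
-- Sketch for PostgreSQL; 2020 stands in for the $2 parameter of the question.
SELECT
    date_trunc('month', gs)::date AS month,
    count(*)                      AS working_days
FROM intranet.dayoff AS df
CROSS JOIN LATERAL generate_series(
    df.start_date::timestamp,
    df.stop_date::timestamp - interval '1 day',  -- stop_date treated as exclusive (assumption)
    interval '1 day'
) AS gs
WHERE gs >= timestamp '2020-01-01'
  AND gs <  timestamp '2021-01-01'               -- keep only days within the requested year
  AND extract(isodow FROM gs) < 6                -- Monday to Friday
  AND NOT EXISTS (
        SELECT 1
        FROM holidays_poland(2020, 2020) AS h    -- function name and arguments as used in the question
        WHERE h.datestamp = gs::date
      )
GROUP BY 1
ORDER BY 1;
```

Because each day-off range is expanded into individual days before grouping, ranges that cross a month or year boundary need no special `CASE` handling; every day simply lands in its own month.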
