Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.3k views
in Technique[技术] by (71.8m points)

datetime - MySQL: group by consecutive days and count groups

I have a database table which holds each user's checkins in cities. I need to know how many days a user has been in a city, and then, how many visits a user has made to a city (a visit consists of consecutive days spent in a city).

So, consider I have the following table (simplified, containing only the DATETIMEs - same user and city):

      datetime
-------------------
2011-06-30 12:11:46
2011-07-01 13:16:34
2011-07-01 15:22:45
2011-07-01 22:35:00
2011-07-02 13:45:12
2011-08-01 00:11:45
2011-08-05 17:14:34
2011-08-05 18:11:46
2011-08-06 20:22:12

The number of days this user has been to this city would be 6 (30.06, 01.07, 02.07, 01.08, 05.08, 06.08).

I thought of doing this using SELECT COUNT(id) FROM table GROUP BY DATE(datetime)

Then, for the number of visits this user has made to this city, the query should return 3 (30.06-02.07, 01.08, 05.08-06.08).

The problem is that I have no idea how shall I build this query.

Any help would be highly appreciated!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You can find the first day of each visit by finding checkins where there was no checkin the day before.

select count(distinct date(start_of_visit.datetime))
from checkin start_of_visit
left join checkin previous_day
    on start_of_visit.user = previous_day.user
    and start_of_visit.city = previous_day.city
    and date(start_of_visit.datetime) - interval 1 day = date(previous_day.datetime)
where previous_day.id is null

There are several important parts to this query.

First, each checkin is joined to any checkin from the previous day. But since it's an outer join, if there was no checkin the previous day the right side of the join will have NULL results. The WHERE filtering happens after the join, so it keeps only those checkins from the left side where there are none from the right side. LEFT OUTER JOIN/WHERE IS NULL is really handy for finding where things aren't.

Then it counts distinct checkin dates to make sure it doesn't double-count if the user checked in multiple times on the first day of the visit. (I actually added that part on edit, when I spotted the possible error.)

Edit: I just re-read your proposed query for the first question. Your query would get you the number of checkins on a given date, instead of a count of dates. I think you want something like this instead:

select count(distinct date(datetime))
from checkin
where user='some user' and city='some city'

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
...