Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
911 views
in Technique[技术] by (71.8m points)

mysql - JOIN, GROUP BY, ORDER BY

The problem I first had with the following query was that the group by clause was performed before the order by:

The saved.recipe_id column is an integer generated by UNIX_TIMESTAMP()

SELECT
    saved.recipe_id,
    saved.`date`,
    user.user_id
FROM saved
    JOIN user
        ON user.id = saved.user_id
GROUP BY saved.recipe_id
ORDER BY saved.`date` DESC

So I tried all sorts of different possible solution with sub queries and other bs. In the end I ended up with trying out some different sub queries in the join clause witch required me to change the table order from the from clause to the join clause. I decided to just try the following out:

SELECT
    saved.recipe_id,
    saved.`date`,
    user.user_id
FROM user
    JOIN saved
        ON user.id = saved.user_id
GROUP BY saved.recipe_id
ORDER BY saved.`date` DESC

For some reason this seems to order correctly, but why?
How can this change make my query sort more correctly then before?
Does it really? or is it just happen to do it for the test cases I put it up against?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

So the problem I first had with the following query was that the group by clause was performed before the order by:

This is not a problem. This is how SQL is defined and how it operates. The group by creates a new set of rows and order by orders those rows.

There is no ordering issue here. There is an "understanding of SQL" issue. Your order by is only ordering the results of the query. These results are produced by the group by, and the order o fthe joins has nothing to do with the results.

You are using a MySQL extension called Hidden Columns. This is when you have an aggregation query that has columns in the select (or having or order by clauses) that are not part of aggregation functions (sum(), etc) or part of the group by. Here is a quote from the documentation:

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values within each group the server chooses.

Presumably, you want the most recent date and user associated with that. The following query does what you want correctly and consistently:

SELECT saved.recipe_id, max(saved.`date`) as MostRecentDate,
       substring_index(group_concat(user.user_id), ',', 1) as MostRecentUser
FROM user JOIN
     saved
     ON user.id = saved.user_id
GROUP BY saved.recipe_id
ORDER BY max(saved.`date`) DESC;

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...