Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
273 views
in Technique[技术] by (71.8m points)

postgresql - SQL (Postgres): ON vs WHERE for better performance

I have a person and a task table. The task table is much bigger and contains millions of rows.

With the following (fast) SQL statement I get a list of tasks for a given person:

select p.name, t.name
from person p
join task t on t.person_id = p.id 
where p.id = 1234

Unfortunately, I need to work with the following (equivalent but slow) SQL statement:

select p.name, t.name
from person p
join task t on 1 = 1
where t.id in (
  select t.id
  from task t
  where t.person_id = p.id )
and p.id = 1234

Why does the second statement take about 20 seconds, while the first only runs for less than a second. Can I use an index or something to speed up the second statement as well? I cannot change the SQL statement.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)
select p.name, t.name
from person p
join task t on t.person_id = p.id 
where p.id = 1234
  • This is INNER JOIN
  • It uses proper join conditions and p.id = 1234 further reduces the cost of the query

..

select p.name, t.name
from person p
join task t on 1 = 1
where t.id in (
  select t.id
  from task t
  where t.person_id = p.id )
and p.id = 1234
  • This is CROSS JOIN which will generate m*n records
  • Here, p.id = 1234 is used but still number of records will be (number of records satisfying p.id = 1234) * (Number of records in Tasks table)
  • IN sub-query is corelated query which takes much time as query will be executed once for each record.
  • Each expression/statements used in this query is costly. Hence, It is taking time.

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...