Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
707 views
in Technique[技术] by (71.8m points)

sql server - Why does a Recursive CTE in Transact-SQL require a UNION ALL and not a UNION?

I get that an anchor is necessary, that makes sense. And I know that a UNION ALL is needed, if your recursive CTE doesn't have one, it just doesn't work... but I can't find a good explanation of why that is the case. All the documentation just states that you need it.

Why can't we use a UNION instead of a UNION ALL in a recursive query? It seems like it would be a good idea to not include duplicates upon deeper recursion, doesn't it? Something like that should already be working under the hood already, I would think.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

I presume the reason is that they just haven't considered this a priority feature worth implementing. It looks like Postgres does support both UNION and UNION ALL.

If you have a strong case for this feature you can provide feedback at Connect (or whatever the URL of its replacement will be).

Preventing duplicates being added could be useful as a duplicate row added in a later step to a previous one will nearly always end up causing an infinite loop or exceeding the max recursion limit.

There are quite a few places in the SQL Standards where code is used demonstrating UNION such as below

enter image description here

This article explains how they are implemented in SQL Server. They aren't doing anything like that "under the hood". The stack spool deletes rows as it goes so it wouldn't be possible to know if a later row is a duplicate of a deleted one. Supporting UNION would need a somewhat different approach.

In the meantime you can quite easily achieve the same in a multi statement TVF.

To take a silly example below (Postgres Fiddle)

WITH R
     AS (SELECT 0 AS N
         UNION
         SELECT ( N + 1 )%10
         FROM   R)
SELECT N
FROM   R 

Changing the UNION to UNION ALL and adding a DISTINCT at the end won't save you from the infinite recursion.

But you can implement this as

CREATE FUNCTION dbo.F ()
RETURNS @R TABLE(n INT PRIMARY KEY WITH (IGNORE_DUP_KEY = ON))
AS
  BEGIN
      INSERT INTO @R
      VALUES      (0); --anchor

      WHILE @@ROWCOUNT > 0
        BEGIN
            INSERT INTO @R
            SELECT ( N + 1 )%10
            FROM   @R
        END

      RETURN
  END

GO

SELECT *
FROM   dbo.F () 

The above uses IGNORE_DUP_KEY to discard duplicates. If the column list is too wide to be indexed you would need DISTINCT and NOT EXISTS instead. You'd also probably want a parameter to set the max number of recursions and avoid infinite loops.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...