Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
296 views
in Technique[技术] by (71.8m points)

sql - postgresql 9.4 - prevent app selecting always the latest updated rows

I have a (Ruby on Rails 4) app and using a postgresql query that basically look into a Opportunity table, search randomly among the rows those where opportunity is 'available=true' and update these selected rows with ' available = false'. Each time a user clicks on 'try an opportunity' the app uses the query below.

UPDATE opportunities s
SET    opportunity_available = false
FROM  (
   SELECT id
   FROM   opportunities
   WHERE  deal_id = #{@deal.id}
   AND    opportunity_available
   AND    pg_try_advisory_xact_lock(id)
   LIMIT  1
   FOR    UPDATE
   ) sub
WHERE     s.id = sub.id
RETURNING s.prize_id, s.id;
// inspired by https://stackoverflow.com/questions/33128531/put-pg-try-advisory-xact-lock-in-a-nested-subquery

I've been struggling for 5 days but I have managed to understand now roughly how it's behaving (badly): now I need to know how to change it.

Indeed, the app "becomes a little crazy" and select (and update the row) exactly by the latest I updated.

Let me be clear on my process on how I create these opportunities

  • I as admin create 20 opportunities (rows) via my app as an admin

  • then in my admin panel, I create a prize which a quantity let's say prize id =45 with quantity = 4. the app will go in Opportunity table and fill randomly (this part works perfectly) 4 rows with prize_id= 45.

Now a user runs the app and as explained above, the app will not choose randomly a row but will begin always by the latest updated: it seems the takes one of the row with a prize, then another with a prize, then another and never those without prizes (with prized_id= empty)...

Now I made some manual experiments: I tried manually change the values on my table (via pgadmin), the weirdest thing happen: if I modify row 32, then row 45, then row 67, when the user tries to play again, guess what, the lines that are picked randomly are exactly the ones I updated in the reverse order: it will pick line 67 then line45 then line 32.. It won't even consider the other rows that could be chosen (all the others opportunities where available= true).

I also tried not to use for update or the 'pg_try_advisory_xact_lock(id)' line and it seems it still has the same issue.

As, as an admin, I create first the total 20 rows, THEN the 4 winning rows, they're the last to be udpated (even if on my pgadmin screen they stay on the same rows...maybe in the backround, postgresql is ordering them as the last updated consecutively and selecting them?) and that's kind of why as soon as one of 4 these winning rows is selected, then all the others follow.

To be clear I would be OK with the app going row by row to select each opportunity available (ex: row 3, then row 4, then row 5 as per what I see on pgadmin as rows are already attributed the prizes totally randomly). the problem is it's not doing this, it's taking often all the winning rows consecutively....

I'm speechless and no clue on how to break this pattern.

Note: this pattern is not happening continuously 100% of the time but very often: for example if i have more than 4 winning rows, it's like for 2 minutes if I keep clicking as a user it behaves as said here, then it stops and seems (or I might be wrong) to behave normally then again after & minute will again only select winning rows...

EDIT 1

Here is how prizes are injected inside Opportunity table (example whehn I create a prize which id is 21 and quantity of this prize= 3)=> it randomly send them (as far as I understand) but only where there is not already a prize_id (i.e. if opportunity has prize_id=empty, it can put it there)

  SQL (2.4ms)  
  UPDATE "opportunities"
  SET "prize_id" = 21
  WHERE "opportunities"."id" 
  IN (
    SELECT "opportunities"."id"
    FROM "opportunities"
    WHERE (deal_id = 341 AND prize_id IS NULL)
    ORDER BY RANDOM()
    LIMIT 3) //
   (0.9ms)  COMMIT

This SQL query is generated by a Rails gem (called Randumb: github.com/spilliton/randumb)

EDIT 2

I moved to another more precise question: Postgresql 9.4 - FASTEST query to select and update on large dataset (>30M rows) with heavy writes/reads and locks

Indeed I think the problem is really that I need a truly RANDOM pick and move away from ARBITRARY pick.

It was already said by Erwin here Advisory locks or NOWAIT to avoid waiting for locked rows? but now I understand what he meant ("Arbitrary" also matters because it implies that Postgresql will typically pick the same row for the same query arbitrarily, which makes lock contention a much bigger issue than truly random picks would. " postgresql is free to choose the fastest way to output the 'LIMIT 1' and it's picking always the same rows= those which have been upated the last ones. But I can't do with this as the latest updated are all Winning opportunities.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Just an idea: instead of calling random() use it as default value for a column(which can be indexed) A similar way could use a serial with an increment of about 0.7 * INT_MAX.

i tmp.sql

CREATE TABLE opportunities
    ( id SERIAL NOT NULL PRIMARY KEY
    , deal_id INTEGER NOT NULL DEFAULT 0
    , prize_id INTEGER
    , opportunity_available boolean NOT NULL DEFAULT False
            -- ----------------------------------------
            -- precomputed random() , (could be indexed)
    , magic DOUBLE precision NOT NULL default RANDOM()
    );

INSERT INTO opportunities(deal_id)
SELECT 341
FROM generate_series(1,20) gs
    ;
VACUUM ANALYZE opportunities;

PREPARE add_three (integer) AS (
WITH zzz AS (
  UPDATE opportunities
  SET prize_id = 21
    , opportunity_available = True
    -- updating magic is not *really* needed here ...
    , magic = random()
  WHERE opportunities.id
  IN (
    SELECT opportunities.id
    FROM opportunities
    WHERE (deal_id = $1 AND prize_id IS NULL)
    -- ORDER BY RANDOM()
    ORDER BY magic
    LIMIT 3)
RETURNING id, magic
    ) -- 
SELECT * FROM zzz
    );

PREPARE draw_one (integer) AS (
  WITH upd AS (
  UPDATE opportunities s
  SET    opportunity_available = false
  FROM  (
     SELECT id
     FROM   opportunities
     WHERE  deal_id = $1
     AND    opportunity_available
     AND    pg_try_advisory_xact_lock(id)
     ORDER BY magic
     LIMIT  1

     FOR    UPDATE
     ) sub
  WHERE     s.id = sub.id
  RETURNING s.prize_id, s.id, magic
    )
SELECT * FROM upd
    );

SELECT * FROM opportunities;

echo add3
EXECUTE add_three(341);
SELECT * FROM opportunities;

echo add3 more
EXECUTE add_three(341);
SELECT * FROM opportunities;

echo draw1
EXECUTE draw_one(341);
SELECT * FROM opportunities;

echo draw2
EXECUTE draw_one(341);
SELECT * FROM opportunities;

VACUUM ANALYZE opportunities;

echo draw3
EXECUTE draw_one(341);
SELECT * FROM opportunities;

echo draw4
EXECUTE draw_one(341);
SELECT * FROM opportunities;

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

2.1m questions

2.1m answers

60 comments

57.0k users

...