Query Progress Bar

If this were YouTube, the thumbnail would look something like this

Imagined YouTube Thumbnail

But seriously. Is there any way to monitor long running Postgres queries?

Kinda!

Using sequences.

The Setup

We create two tables. One small and one big.

DROP TABLE IF EXISTS small;
CREATE TABLE small (
  some_val int
);

INSERT INTO small
SELECT
  gs
FROM
  generate_series(1, 100) AS gs;

DROP TABLE IF EXISTS big;
CREATE TABLE big (
  other_val int
);

INSERT INTO big
SELECT
  gs % 100 + 1
FROM
  generate_series(1, 50000000) AS gs;

CREATE INDEX ON big (other_val);

VACUUM ANALYZE small;
VACUUM ANALYZE big;

The Query

This is the slow query we'll try to monitor

SELECT
  some_val,
  (
    SELECT
      count(*)
    FROM
      big AS b
    WHERE
      b.other_val = s.some_val
  )
FROM
  small AS s;

The Technique

Create a dummy sequence

DROP SEQUENCE IF EXISTS my_sequence;
CREATE SEQUENCE my_sequence;

Find a way to incorporate a call to nextval for this sequence

nextval('my_sequence')

Easy enough.

SELECT
  some_val,
  (
    SELECT
      count(*)
    FROM
      big AS b
    WHERE
      b.other_val = s.some_val
  ),
  nextval('my_sequence')
FROM
  small AS s;

Now, while the slow query is running we can open another psql session and run our 'monitor' in a \watch and see how quickly Postgres is chewing through rows. Sequnces aren't transactional, which is why this works. When we see the values from nextval in our watch start shooting up, that's Postgres working its way through the query.

SELECT nextval('my_sequence');
\watch 1

The Video Content

Query monitoring animation

Not Perfect

Unfortunately this won't work for every query. Postgres needs to evaluate the call to nextval in the 'meat' of the execution for it to work, and Postgres won't always choose to do that. It won't work on this query, for example

SELECT
  some_val,
  count(*),
  nextval('my_sequence')
FROM
  small AS s INNER JOIN
  big AS b ON
    s.some_val = b.other_val
GROUP BY
  some_val;

More Queries

It's nice when it works though. It can work for UPDATE statements

UPDATE
  big
SET
  other_val = other_val + 1 + nextval('my_sequence') * 0;

But you could get the math wrong doing it that way and end up changing the logic of the UPDATE. Better to use the RETURNING clause

UPDATE
  big
SET
  other_val = other_val + 1
RETURNING
  nextval('my_sequence');

DELETE statements can work the same way

DELETE FROM
  big
RETURNING
  nextval('my_sequence');

INSERT statements can use RETURNING the same way as well

INSERT INTO big
SELECT
  gs % 100
FROM
  generate_series(1, 25000000) AS gs
RETURNING
  nextval('my_sequence');

Bonus Trick

There's another way you can monitor INSERT statements that doesn't involve tacking on a RETURNING clause or modifying the INSERT statement in any way. This use case might be pretty rare, but I once wanted to monitor an INSERT statement and I couldn't easily modify the query. The trick was to add a dummy column to the table and make it use the sequence by default.

ALTER TABLE big ADD COLUMN dummy_value int;
ALTER TABLE big ALTER COLUMN dummy_value SET DEFAULT nextval('my_sequence');

INSERT INTO big
SELECT
  gs % 100 + 1
FROM
  generate_series(1, 25000000) AS gs;

This was in a non-prod environment so adding a dummy column like that was perfectly OK in that case. Segue to final section

Caveat City.

Here are just a few
  • It won't work for every query
  • It won't work on read replicas
  • Don't use this in prod. Think about how it would make you feel. Creating random sequences that you would forget to clean up. Running hand edited queries that haven't gotten the test coverage the rest of your application gets. The answer is bad, it would make you feel bad.
  • Even in lower environments you still may not have permission to create the necessary sequences