If this were YouTube, the thumbnail would look something like this
But seriously. Is there any way to monitor long running Postgres queries?
We create two tables. One small and one big.
DROP TABLE IF EXISTS small; CREATE TABLE small ( some_val int ); INSERT INTO small SELECT gs FROM generate_series(1, 100) AS gs; DROP TABLE IF EXISTS big; CREATE TABLE big ( other_val int ); INSERT INTO big SELECT gs % 100 + 1 FROM generate_series(1, 50000000) AS gs; CREATE INDEX ON big (other_val); VACUUM ANALYZE small; VACUUM ANALYZE big;
This is the slow query we'll try to monitor
SELECT some_val, ( SELECT count(*) FROM big AS b WHERE b.other_val = s.some_val ) FROM small AS s;
Create a dummy sequence
DROP SEQUENCE IF EXISTS my_sequence; CREATE SEQUENCE my_sequence;
Find a way to incorporate a call to
nextval for this sequence
SELECT some_val, ( SELECT count(*) FROM big AS b WHERE b.other_val = s.some_val ), nextval('my_sequence') FROM small AS s;
Now, while the slow query is running we can open another psql session and run our 'monitor' in a
\watch and see how quickly Postgres is chewing through rows. Sequnces aren't transactional, which is why this works. When we see the values from
nextval in our watch start shooting up, that's Postgres working its way through the query.
SELECT nextval('my_sequence'); \watch 1
The Video Content
Unfortunately this won't work for every query. Postgres needs to evaluate the call to
nextval in the 'meat' of the execution for it to work, and Postgres won't always choose to do that. It won't work on this query, for example
SELECT some_val, count(*), nextval('my_sequence') FROM small AS s INNER JOIN big AS b ON s.some_val = b.other_val GROUP BY some_val;
It's nice when it works though. It can work for UPDATE statements
UPDATE big SET other_val = other_val + 1 + nextval('my_sequence') * 0;
But you could get the math wrong doing it that way and end up changing the logic of the UPDATE. Better to use the RETURNING clause
UPDATE big SET other_val = other_val + 1 RETURNING nextval('my_sequence');
DELETE statements can work the same way
DELETE FROM big RETURNING nextval('my_sequence');
INSERT statements can use RETURNING the same way as well
INSERT INTO big SELECT gs % 100 FROM generate_series(1, 25000000) AS gs RETURNING nextval('my_sequence');
There's another way you can monitor INSERT statements that doesn't involve tacking on a RETURNING clause or modifying the INSERT statement in any way. This use case might be pretty rare, but I once wanted to monitor an INSERT statement and I couldn't easily modify the query. The trick was to add a dummy column to the table and make it use the sequence by default.
ALTER TABLE big ADD COLUMN dummy_value int; ALTER TABLE big ALTER COLUMN dummy_value SET DEFAULT nextval('my_sequence'); INSERT INTO big SELECT gs % 100 + 1 FROM generate_series(1, 25000000) AS gs;
This was in a non-prod environment so adding a dummy column like that was perfectly OK in that case. Segue to final section
- Here are just a few
- It won't work for every query
- It won't work on read replicas
- Don't use this in prod. Think about how it would make you feel. Creating random sequences that you would forget to clean up. Running hand edited queries that haven't gotten the test coverage the rest of your application gets. The answer is bad, it would make you feel bad.
- Even in lower environments you still may not have permission to create the necessary sequences