At EdgeTier, we use PostgreSQL databases for almost all of our relational data storage. PostgreSQL open-source, fast, has interesting text-search capabilities, has nice JSON parsing functionality, runs well on Amazon AWS. However, with complex applications such as WatchTower (a contact centre monitoring tool), we sometimes find ourselves in an SQL-optimisation hunt – working to speed up an API response, query result, or application process.
For active queries, PostgreSQL has a “statistics collector” subsystem that collates data on table, server, query, index, and connection activity. The database exposes information through a number of “pg_stat” predefined views and some lower-level statistic functions for power users.
Additionally, there are several system information functions that can extract session and system information on processes and queries that are currently ongoing.
The key tables and functions that I’ve found useful are:
- pg_stat_activity: A table with one entry per server process, showing details of the running query for each.
- pg_locks: Information on current locks held within the database by open transactions, with one row per lockable object.
- pg_blocking_pids(): A function that can find the process IDs (PIDs) of sessions that are blocking the PostgreSQL server process of a supplied PID.
- pg_cancel_backend(): Function that cancels the currently running query by sending a SIGINT to a process ID.
- pg_terminate_backend(): Terminate a backend process completely (the query and usually the connection) on the database (uses SIGTERM instead of SIGINT).
List long-running and slow queries on PostgreSQL
Poorly written queries or poorly structured data can cause very long execution times on your database. Typically discovered through slow response or extended increases in database CPU, the pg_stat_activity
view can help to find out what query is causing issues. The pg_stat_activity view contains details of all currently running queries, including user, connection, and timing details.
A simple select * from pg_stat_activity
will provide a snapshot of what is happening on your PostgreSQL database, with one line per current transaction, and the key columns:
- datname: The database name that the query is running on.
- pid: The Process ID of the backend for the running query. PID can be used with pg_terminate_backend() or pg_cancel_backend() separately.
- usename: user name that is running/connected to this backend
- client_addr: IP address of the connection to this backend. If this is empty/null, the row may refer to an internal connection or internal process.
- backend_start: Time when the backend process was started, or when the client connected.
- query_start: Time when the currently active query was started.
- state: The current status of the backend, with options:
- active: currently executing the query in the query column.
- idle: not executing anything, and waiting for a new command.
- idle in transaction: the backend is in a transaction, but not currently doing any work. This is the same as with “idle in transaction (error)” except one of the statements has an error.
- fastpath function call: Executing a fast-path function.
- disabled: track_activities has been disabled for this backend.
- wait_event & wait_event_type: The event type that the backend is waiting for. This can be an extensive list. Watch for locks on database objects here.
- query: The text of the backends most recent, or currently active, query. As per the documentation, this is truncated at 1024 bytes, changeable with “track_activity_query_size” in the server configuration.
- backend_type: There’s a few different types, but for most connections from external clients, this will be “client backend”.
Find queries running longer than 5 minutes
One useful addition to this is to show all queries that have been running for longer than five minutes for debugging purposes:
SELECT pid, user, pg_stat_activity.query_start, now() - pg_stat_activity.query_start AS query_time, query, state, wait_event_type, wait_event FROM pg_stat_activity WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes';
Kill long-running PostgreSQL query processes
Where some queries look like they’re not going to finish, you can use the pid
(process ID) from the pg_stat_activity or pg_locks views to terminate the running process.
pg_cancel_backend(pid)
will attempt to gracefully kill a running query process.pg_terminate_backend(pid)
will immediately kill the running query process, but potentially have side affects across additional queries running on your database server. The full connection may be reset when running pg_terminate_backend, so other running queries can be affected. Use as a last resort.
Finding blocked processes and blocking queries
The pg_blocking_pids()
function is a useful shortcut to find the database connections / sessions that are blocking another session. The pg_blocking_pids() function returns an postgreSQL array of PIDs that are blocking the specified server process PID that you provide to the query. Typically, a server process blocks another if it holds a lock that is needed by the second process.
A useful function (from”How to detect query which holds the lock in Postgres?” Stackoverflow answer) to show the blocked processes on your database along with the actual query that is blocking them is:
SELECT activity.pid, activity.usename, activity.query, blocking.pid AS blocking_id, blocking.query AS blocking_query FROM pg_stat_activity AS activity JOIN pg_stat_activity AS blocking ON blocking.pid = ANY(pg_blocking_pids(activity.pid));
Viewing Locks on Tables – pg_lock
The pg_lock view provides information about the locks held by active processes in the database, with a single entry per lockable object (which can be tables, pages, transaction IDs, and other database objects). Locks are acquired as part of transactions, and are typically used where it is pertinent that the data in a table or database object is not changed before the full transaction is completed.
A simple select * from pg_locks
will immediately provide you with (key columns):
- locktype: The type of object that is locked – (see here) relation, page, object, tranasctionid, userlock etc.
- relation: OID (object ID) of the relation targeted by the lock. Join to the
pg_class
OID to get the names of the tables here. - page: page number targeted by the lock within the relation.
- transactionid xid: ID of the transaction targeted by the lock.
- pid: Process ID of the server process holding/waiting for this lock. Joining this column onto the
pg_stat_activity
view can provide all of the information above on the query / user / time etc. - granted: True if lock is held, false if the process is waiting.
Viewing locks with table names and queries
Joining the pg_lock
view to pg_stat_activity
for the query and pg_class
for the table names can be useful to gather more context as to what is locking on your database at any point in time (for more see “PostgreSQL find locks including the table name” on StackOverflow):
select relname as relation_name, query, pg_locks.* from pg_locks join pg_class on pg_locks.relation = pg_class.oid join pg_stat_activity on pg_locks.pid = pg_stat_activity.pid
Does the below query not include the ones in idle status as well? Is a query in an idle state considered an LRQ?
It is. And no, idle queries aren’t considered LRQ at all. When state is idle, the query shows information for the last executed query. For the purpose of the article -identify LRQ- the query should be modified to the following:
Btw, 5 minutes is an awfull long time.
[…] PostgreSQL: Find slow, long-running, and Blocked Queries […]
Use
pg_stat_activity.usename
rather thanuser
to get the user running the query rather than your own username.