What is replication lag? How can we monitor Replication lag? Replication lag does not occur in most setups; however, it is important to monitor the entire endpoints of replication to ensure that our data is safe.
What is Replication Lag? Read More
The data can be replicated from the primary/master node to the replica/slave node after the transaction has been committed on the primary node. The replica is never ahead of the primary. It is usually a little behind. This delay is called lag.
Monitoring Primary Server
The system view to monitor Replication on the Primary node is pg_stat_replication.
\d pg_stat_replication View "pg_catalog.pg_stat_replication" Column | Type | Collation | Nullable | Default ------------------+--------------------------+-----------+----------+--------- pid | integer | | | usesysid | oid | | | usename | name | | | application_name | text | | | client_addr | inet | | | client_hostname | text | | | client_port | integer | | | backend_start | timestamp with time zone | | | backend_xmin | xid | | | state | text | | | sent_lsn | pg_lsn | | | write_lsn | pg_lsn | | | flush_lsn | pg_lsn | | | replay_lsn | pg_lsn | | | write_lag | interval | | | flush_lag | interval | | | replay_lag | interval | | | sync_priority | integer | | | sync_state | text | | | reply_time | timestamp with time zone | | |
Important Fields in pg_stat_replication
Let’s talk about the fields in detail.
- pid: This represents the process ID of the wal_receiver process in charge of this streaming connection. If you check the process table in your operating system, you should find a PostgreSQL process with exactly this number
- usesysid: OID of user which is used for Streaming replication.
- usename: This (not username; mind the missing “r”) stores the name of the user related to usesysid. It is what the client has put into the connection string
- application_name: Application name connected to master
- client_addr: This will tell you where the streaming connection comes from. It holds the IP address of the client.
- client_hostname: Hostname of standby.
- client_port: This is the TCP port number the client is using for communication with the particular WAL sender. -1 will be shown if local UNIX sockets are used.
- backend_start: Start time when SR connected to Master.
- backend_xmin: The transaction ID reported by hot_standby_feedback (which is the oldest transaction ID on a slave). It can make sense to monitor the difference between the current transaction ID on the master and the oldest one reported by the slave as well, to check whether there are unusually high differences
- state: Current WAL sender state i.e streaming
- sent_lsn: This represents the last transaction log position sent to the connection.
- write_lsn: Last transaction written on disk at standby.
- flush_lsn: This is the last location that was flushed to the standby system. Mind the difference between writing and flushing here. Writing does not imply flushing.
- replay_lsn: Last transaction flush on disk at standby.
- write_lag: Elapsed time during committed WALs from primary to the standby (but not yet committed in the standby)
- flush_lag: Elapsed time during committed WALs from primary to the standby (WAL’s has already been flushed but not yet applied)
- replay_lag: Elapsed time during committed WALs from primary to the standby (fully committed in standby node)
- sync_priority: Priority of standby server being chosen as synchronous standby
- sync_state: Sync State of standby (is it async or synchronous).
Below is a sample query output with a single replica:
When you see an entry in the system view, that means that there is one ACTIVE stream. Otherwise, there is none. You won’t see any entry.
If the values of all *_lsn columns are the same, that means that the Replica has caught up 100%.
sent_lsn: Last write-ahead log location sent on this connection (How much WAL has been sent over the network already?).
write_lsn: Last write-ahead log location written to the operating system (without flushing).
flush_lsn: Last write-ahead log location flushed to disk by this standby server.
replay_lsn: Last write-ahead log location replayed into the database on this standby server (Data that is already visible to end-users).
The following is an illustration of the data flow.
Monitoring Replica Servers
Replica server has incoming replication and remains in recovery mode so that it can replay the WALs/records as they come in. There is a view called pg_stat_wal_receiver.
\d pg_stat_wal_receiver View "pg_catalog.pg_stat_wal_receiver" Column | Type | Collation | Nullable | Default -----------------------+--------------------------+-----------+----------+--------- pid | integer | | | status | text | | | receive_start_lsn | pg_lsn | | | receive_start_tli | integer | | | written_lsn | pg_lsn | | | flushed_lsn | pg_lsn | | | received_tli | integer | | | last_msg_send_time | timestamp with time zone | | | last_msg_receipt_time | timestamp with time zone | | | latest_end_lsn | pg_lsn | | | latest_end_time | timestamp with time zone | | | slot_name | text | | | sender_host | text | | | sender_port | integer | | | conninfo | text | | |
Monitoring Replication Slots
The view pg_replication_slots gives one row for each replication slot in the primary. A replication slot guarantees that the WAL will not vanish if the replica is lagging behind. A primary will recycle its WAL as soon as it doesn’t need it on its own anymore if there are no replication slots.
\d pg_replication_slots View "pg_catalog.pg_replication_slots" Column | Type | Collation | Nullable | Default ---------------------+---------+-----------+----------+--------- slot_name | name | | | plugin | name | | | slot_type | text | | | datoid | oid | | | database | name | | | temporary | boolean | | | active | boolean | | | active_pid | integer | | | xmin | xid | | | catalog_xmin | xid | | | restart_lsn | pg_lsn | | | confirmed_flush_lsn | pg_lsn | | | wal_status | text | | | safe_wal_size | bigint | | | two_phase | boolean | | |
Cheers!