What is Checkpoint?
Checkpoint periodically flushes all dirty buffers and creates checkpoint record in the WAL log, which is used for recovery.
The purpose of the checkpoint is to ensure that all the dirty buffers generated up to a certain point are sent to the disk. This will recycle the WAL up to that point. Read More
Journey of a PostgreSQL Write
BG Writer (Background Writer)
The BGWriter helps to diminish the I/O operations needed to be done by Checkpoints. This stabilizes the system’s I/O. BGWriter writes Dirty Buffers to disk when the system load is low to minimize the impact on active transactions. It constantly monitors the DB Buffer Pol and makes decisions on when to write dirty buffers to disk and which buffers to free up for reuse.
Frequency of Checkpoints
Use the below query to find the frequency of the checkpoints:
WITH sub as ( SELECT EXTRACT( EPOCH FROM (now() - stats_reset) ) AS seconds_since_start, ( checkpoints_timed + checkpoints_req ) AS total_checkpoints FROM pg_stat_bgwriter ) SELECT total_checkpoints, seconds_since_start / total_checkpoints / 60 AS minutes_between_checkpoints FROM sub;
There are two important columns checkpoints_req, checkpoints_timed.
If the checkpoints_req is more than the checkpoints_timed, PostgreSQL is doing checkpoints due to the high WAL generation.
If the checkpoints are happening frequently it will cause more IO load on the machine so increase the max_wal_size parameter.
checkpoints_req > checkpoints_timed = (bad). This means that PostgreSQL is doing checkpoints due to the high WAL generation
The basic rule is simple – checkpoints_timed value should be much higher than checkpoints_req. It’s best when the last one (checkpoints_req) is near zero. It may be obtained by increasing max_wal_size (or checkpoint_segments) and checkpoint_timeout. A good initial point is to set max_wal_size to 10GB and checkpoint_timeout = 30min. Also, checkpoint_completion_target should be configured in a way that will enable spread of execution of checkpoints to a time that is the closest to timeout or to the size of collected WAL. A good starting point for this is 0.9 (default in PostgreSQL 14 and above). Thus, with these settings checkpoint will occur when Postgres collects 10GB of WAL, or after 30 minutes from the last checkpoint. The already running checkpoint’s execution will spread over 27 minutes or until Postgres again collects 9GB of WAL.
Cheers!
Leave a Reply