Sample Log Output:
Description:
This error occurs when the standby server receives updates or deletes in the WAL stream that would invalidate data currently being accessed by a query running on the standby.
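A common mitigation for these recovery-conflict cancellations (a sketch, not a universal recommendation — the right values depend on your workload) is to tune the standby's settings in postgresql.conf:

```ini
# Give standby queries this much time to finish before WAL replay cancels them
max_standby_streaming_delay = 30s

# Let the standby report which rows its queries still need to the primary,
# reducing conflicts at the cost of possible bloat on the primary
hot_standby_feedback = on
```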
Here is a script to check table-level locks:
SELECT act1.query AS blocking_query,
       act2.query AS blocked_query,
       l1.pid AS blocking_pid,
       l2.pid AS blocked_pid,
       l1.relation::regclass
FROM pg_locks l1,
     pg_locks l2,
     pg_stat_activity act1,
     pg_stat_activity act2
WHERE l1.granted = true
  AND l2.granted = false
  AND l1.pid = act1.pid
  AND l2.pid = act2.pid
  AND l1.relation = l2.relation;
Cheers!
Here is a script to get the Top 10 Time-Consuming Queries (This will work for PostgreSQL version 13 and above)
We will make use of the pg_stat_statements view. This view will tell us which types of queries are slow and how often these queries are called.
In AWS RDS, this module is already installed by default. So, all you need to do is to create the extension before running the script.
create extension pg_stat_statements;
In Tencent DB for PostgreSQL, both the module and the extension are enabled by default, so there is no need to do anything before running the query.
Disable expanded display
\x
Script to get Top 10 Time Consuming Queries
SELECT t2.rolname,
       t3.datname,
       queryid,
       calls,
       (total_plan_time + total_exec_time) / 1000 AS total_time_seconds,
       (min_plan_time + min_exec_time) / 1000 AS min_time_seconds,
       (max_plan_time + max_exec_time) / 1000 AS max_time_seconds,
       (mean_plan_time + mean_exec_time) / 1000 AS mean_time_seconds,
       (stddev_plan_time + stddev_exec_time) / 1000 AS stddev_time_seconds,
       rows,
       shared_blks_hit,
       shared_blks_read,
       shared_blks_dirtied,
       shared_blks_written,
       local_blks_hit,
       local_blks_read,
       local_blks_dirtied,
       local_blks_written,
       temp_blks_read,
       temp_blks_written,
       blk_read_time / 1000 AS blk_read_time_seconds,
       blk_write_time / 1000 AS blk_write_time_seconds
FROM pg_stat_statements t1
JOIN pg_roles t2 ON (t1.userid = t2.oid)
JOIN pg_database t3 ON (t1.dbid = t3.oid)
WHERE t2.rolname != 'rdsadmin'
  AND queryid IS NOT NULL
ORDER BY total_time_seconds DESC
LIMIT 10;
Cheers!
In this blog post, I’m going to talk about MySQL memory issues: what causes high memory usage, and what its impact is on MySQL database instances.
But first, we can do a quick review on the following topics:
Private memory is allocated per thread when a client connects to the MySQL server. Examples include the sort buffer, join buffer, and internal temporary tables.
InnoDB maintains one or more buffer pools that cache frequently used data and indexes in the main memory.
So, when a client runs a read query, InnoDB first checks whether the required data pages are already in the buffer pool. If they are not, InnoDB reads them from the tablespace and places them in the buffer pool. MySQL then returns the results to the client.
innodb_buffer_pool_size is the most important tuning parameter. It caches table and index data. The buffer pool permits frequently used data to be accessed directly from memory, which speeds up processing.
Setting it too low can degrade performance. Setting it too high can increase memory consumption to the point where the database crashes.
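As a sketch, the parameter is set in my.cnf (the 8G value below is purely illustrative, not a recommendation — a common starting point on a dedicated database host is 50–75% of RAM):

```ini
[mysqld]
# Illustrative value only; size to your workload and available RAM
innodb_buffer_pool_size = 8G
```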
MySQL's memory allocation does not come only from innodb_buffer_pool_size; other buffers such as sort_buffer_size, read_buffer_size, read_rnd_buffer_size, join_buffer_size, and tmp_table_size also consume memory, typically requiring an additional 5-10%.
Below is a formula to calculate the approximate memory usage for your MySQL:
Maximum MySQL Memory Usage = innodb_buffer_pool_size + key_buffer_size + ((read_buffer_size + read_rnd_buffer_size + sort_buffer_size + join_buffer_size) X max_connections)
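As a rough sketch, the formula can be evaluated with purely hypothetical configuration values (the numbers below are illustrative, not recommendations):

```shell
# Hypothetical configuration values, in bytes (not recommendations)
MB=$((1024 * 1024))
GB=$((1024 * MB))

innodb_buffer_pool_size=$((8 * GB))
key_buffer_size=$((16 * MB))
read_buffer_size=$((256 * 1024))
read_rnd_buffer_size=$((512 * 1024))
sort_buffer_size=$((2 * MB))
join_buffer_size=$((2 * MB))
max_connections=200

# Per-connection buffers are multiplied by the connection limit
per_connection=$((read_buffer_size + read_rnd_buffer_size + sort_buffer_size + join_buffer_size))
max_memory=$((innodb_buffer_pool_size + key_buffer_size + per_connection * max_connections))

echo "Approximate maximum MySQL memory usage: $max_memory bytes"
```

With these values the per-connection buffers alone account for roughly 1 GB at full connection load, on top of the 8 GB buffer pool.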
Swapping can happen when a system requires more memory than is available. An oversized buffer may push the operating system into swapping and slow performance down.
To keep your MySQL/MariaDB memory in RAM rather than swapped out to disk, you have to play with a kernel parameter called swappiness.
The swappiness value changes the balance between swapping out runtime memory and dropping pages from the system page cache. The higher the value, the more the system will swap; the lower the value, the less it will swap. The maximum value is 100, the minimum is 0, and 60 is the default.
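As a sketch, here is how to inspect and lower swappiness on Linux (the value 1 is illustrative, not a universal recommendation; the privileged steps are shown as comments since they require root):

```shell
# Inspect the current swappiness value (0 = avoid swapping, 100 = swap aggressively)
# Falls back to the default of 60 if the file is unavailable (e.g. non-Linux host)
swappiness=$(cat /proc/sys/vm/swappiness 2>/dev/null || echo 60)
echo "Current vm.swappiness: $swappiness"

# Lower it at runtime on a dedicated database host (illustrative value):
#   sudo sysctl vm.swappiness=1
# Persist it across reboots by adding this line to /etc/sysctl.conf:
#   vm.swappiness = 1
```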
I performed a test in AWS RDS MySQL: I ran a stored procedure that consumed more and more memory until the server crashed. Notice that as free memory gets low, swap usage increases, and there is also a spike in disk I/O usage.
When MySQL crashes, we lose the connection to the database. In AWS RDS, we can check the logs and events to identify the cause of the crash. In this case, the log indicated that the database process was killed by the OS due to excessive memory consumption.
I also did the same test in Azure MySQL. The difference between Azure and AWS is that when the database crashes, Azure fails over to the standby replica, whereas in AWS a mysqld process crash does not trigger a failover.
In Azure, under the Private DNS zone, if the IP address has changed, it means the failover to the standby replica succeeded.
Cheers!
innochecksum prints checksums for InnoDB files. This tool reads an InnoDB tablespace file, calculates the checksum for each page, compares the calculated checksum to the stored checksum, and reports mismatches, which indicate damaged pages.
It was originally developed to speed up verifying the integrity of tablespace files after power outages but can also be used after file copies. Because checksum mismatches cause InnoDB to deliberately shut down a running server, it may be preferable to use this tool rather than waiting for an in-production server to encounter the damaged pages.
Below is a script you can use to find corrupted tables. Replace the directory paths with the location of your data files. (Stop the MySQL/MariaDB service first, before running it.)
INNOCKSM_LOG=/mysql/backup/innochecksum_`date +%Y%m%d_%H%M%S`.log

for DB in `ls -1vd /mysql/data/*/ | grep -wv '/mysql/data/mysql/\|/mysql/data/performance_schema/\|/mysql/data/lost+found/'`
do
  for IBD in `ls -1v $DB | grep '\.ibd$'`
  do
    innochecksum ${DB}${IBD}
    if [ $? -ne 0 ]; then
      echo ${DB}${IBD} >> $INNOCKSM_LOG
      innochecksum ${DB}${IBD} >> $INNOCKSM_LOG 2>&1
    fi
  done
done
Cheers!
In this blog post, we are going to discuss MariaDB crash issues due to out-of-memory (OOM), and how to troubleshoot them.
When you get this error, it clearly indicates that the server is running out of memory. In other words, the memory allocated to MariaDB is not enough to handle the process's requirements.
If memory is insufficient, then we just have to increase RAM, right? That is certainly one solution, but it is only the easiest way out: simply adding RAM is neither ideal nor cheap, because a memory-intensive query or process can eventually consume the added memory and lead to the same errors again in a short span of time.
Common Causes of Out-Of-Memory Issues
Here are some ways to troubleshoot:
Cheers!
In our environment, where we have thousands of MariaDB servers, 95% of slow or lagging replication cases are caused by tables that do not have a primary key.
Below is an example of what we can see in show processlist when the SQL Thread is lagging because many tables do not have primary keys.
Below is a script to check tables that do not have a primary key.
SELECT tables.table_schema,
       tables.table_name,
       tables.table_rows
FROM information_schema.tables
LEFT JOIN (SELECT table_schema,
                  table_name
           FROM information_schema.statistics
           GROUP BY table_schema, table_name, index_name
           HAVING Sum(CASE
                        WHEN non_unique = 0 AND nullable != 'YES' THEN 1
                        ELSE 0
                      END) = Count(*)) puks
       ON tables.table_schema = puks.table_schema
      AND tables.table_name = puks.table_name
WHERE puks.table_name IS NULL
  AND tables.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
  AND tables.table_type = 'BASE TABLE'
  AND engine = 'InnoDB';
Using the script above, I counted how many tables do not have a primary key and found that there are 64.
Having a primary key should be the norm/best practice in terms of designing schema anyway.
We have to ensure all our tables have a primary key. This guarantees that all rows are unique, and it lets the SQL thread locate the rows to delete (or update) efficiently.
If there is no way to logically add a natural primary key for the table, a potential solution is to add an auto-increment unsigned integer column as the primary key.
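As a sketch (the table name orders is hypothetical), such a surrogate key can be added in MySQL/MariaDB like this:

```sql
-- Hypothetical table; adds an auto-increment unsigned integer surrogate primary key
ALTER TABLE orders
  ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;
```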
Cheers!
A SQL Server database will be marked with different states if one or more of its core files are in an inconsistent state. The state it is marked with depends on how serious the damage is. Here are some of the states:
Here is a script to check the current state of all databases on a SQL Server instance.
SELECT name, state_desc FROM sys.databases
GO
There are several reasons why a database is marked as 'Recovery Pending':
Let’s Fix It
ALTER DATABASE TestDB SET EMERGENCY
GO

ALTER DATABASE TestDB SET SINGLE_USER
GO

DBCC CHECKDB (TestDB, REPAIR_ALLOW_DATA_LOSS) WITH ALL_ERRORMSGS
GO

ALTER DATABASE TestDB SET MULTI_USER
GO
Cheers!
A SQL Server database runs in different modes and can be in a specific state at any given time.
The modes are:
In this blog post, we will talk about how to deal with a database that is in SUSPECT mode.
Here are some of the reasons why SQL Server marks a Database as Suspect:
How to Recover SQL Server from SUSPECT Mode
1. Turn off the suspect flag on the database and set it to EMERGENCY
EXEC sp_resetstatus 'YourDBName';
ALTER DATABASE YourDBName SET EMERGENCY
2. Perform a consistency check
DBCC CHECKDB ('YourDBName')
3. Bring the database into the Single User mode and roll back the previous transactions
ALTER DATABASE YourDBName SET SINGLE_USER WITH ROLLBACK IMMEDIATE
4. Take a complete backup of the database
5. Attempt the Database Repair allowing some data loss
DBCC CHECKDB ('YourDBName', REPAIR_ALLOW_DATA_LOSS)
6. Bring the database into the Multi-User mode
ALTER DATABASE YourDBName SET MULTI_USER
7. Refresh the database server and verify the connectivity of the database
Cheers!
Concurrent operations have big implications for data integrity. To guarantee integrity, a relational database system must comply with the ACID principles, which are normally implemented through transactions.
In ACID, A stands for Atomicity, and that means when data is modified by some transaction, either all of its data modifications are performed or none of them are performed.
The C in ACID stands for Consistency. When a data modification transaction completes, all data must be in a consistent state. All constraints must be satisfied, and all internal structures must be correct.
I is all about Isolation. Modifications made by one transaction must be isolated from those made by other concurrent transactions. This also implies that if you redo the same operations with the same starting data, the results will always be the same.
The D in ACID stands for Durability. When a transaction completes, its results are stored permanently in the system and persist even if a system failure occurs. Now, to get you set up so that you can follow along with all the demos, let me introduce the tools I'll be using.
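A plain SQL transaction ties these properties together. As a sketch (the accounts table and the values are hypothetical), a funds transfer is atomic because both updates commit together or not at all:

```sql
-- Hypothetical accounts table: either both updates persist, or neither does
BEGIN;
UPDATE accounts SET balance = balance - 60 WHERE name = 'alice';
UPDATE accounts SET balance = balance + 60 WHERE name = 'bob';
COMMIT;  -- durable once committed; a ROLLBACK here instead would discard both updates
```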
Cheers!