As one of our hosts reached a 5-minute avarage load of 40 (with 2 cpus) , it’s time to look at the mysql log for slow queries again.
It’s not easy to find the cause for such incidents, as when the load goes up and things go wrong in the database, of course ALL the queries are starting to run slow.
A good site to start on this topic is always the Mysql Performance Blog . In the comments-section of this entry, I found links to two different scripts analyzing the slow-query log. (Also see the Mysql Performance Blog or the Mysql docs for how to turn on the slow query log in the first place )
MySQL Slow Query Log Filter for PHP5 and Python 2.4
One is MySQL Slow Query Log Filter for PHP5 and Python 2.4 , which goes through the logfile and gathers statistical data like counting the times a certain query occured. The first nice feature is to limit the time range of which queries are taken into account.
This command filters the log to show only queries on the 14th of september 2009, that took 30 secconds or more to execute.
$php mysql_filter_slow_log.php -T=30 --date=14.09.2009-15.09.2009 database-slow.log > output.log
The script , however, regards queries as different unless they match completley. That means that two queries on the same table, just looking up a different row by providing a different value for a key column are not counted as the same query.
For that reason, I changed the script a bit ( patch here ) to remove all strings inside quotes so that
select * from table where id = '1'
and
select * from table where id = '2'
end up being counted as the same query .
To use this functionality with the patch applied , start the script with the –preproc option.
MyProfi
The seccond script MyProfi already does “anonymize” the queries , it therefore provides a more summarized overview , with less options though. The output looks like this:
(...)
queries by pattern:
===================
1. 2 986 [ 9.65%] - select{} ....where id = {};
2. 2 750 [ 8.89%] - (...)
(...)
About Lock-Time
Another number I’d like to see would be the Overall Query Time minus the Lock Time , as this would show which queries take long without waiting for locks to be released. This would be the real “bad ones”, as far as I understand this.
The locking issues are probably worse here than they would have to be as all the site’s tables are MyIsam-Tables that use table-based locking (i.e. locking the whole table instead of the rows that get updated / inserted ).
Update: Maatkit mk-query-digest
Another tool, probably far more proffessional than the before mentioned : mk-query-digest.
It has a lot of options, but even with no special options set it provides very helpful output.
Some of the features I noticed from a quick test was that it does the “anonymizing” of the queries , the output features min, max, avg as well as stdv (standard deviation ) and median value.
There’s also a kind of graphical representation of query time distribution , meaning there are ascii-bars that show how many of a certain kind of query got executed within 0.5,1,10 etc. secconds.
I have not looked into the tool in greater detail (as the host runs smoothly again – for now ) , but it seems to be the best solution for analysing your query logs.