Improving the performance of a system using the MySQL database server is not just about indexes and buffers – getting the most out of the system requires a wider view. We’ll start by looking at what needs to be considered when a system starts needing work to maintain or improve performance.
To get the best performance out of a system, there are three areas that need to be considered (usually in this order):
- Query optimisation
- Schema optimisation
- Server optimisation
Most people are familiar with query optimisation, which is sometimes (incorrectly) assumed to be the start and end of the optimisation process. Query optimisation looks at the queries being run and whether they can be changed to improve performance, usually by rewriting parts of the query. One big killer of database performance is the use of functions in the WHERE clause of a statement – in some cases these are unnecessary, and they can stop the query optimiser from using indexes. Where possible, perform these computations in the client code rather than making the database server do the work – in many cases it’s easier and less disruptive to upgrade the client system than the database server (adding an additional web server is simpler than upgrading the database server hardware, and involves no downtime). The slow query log can indicate which queries deserve scrutiny, but in some cases a slow log entry is the effect rather than the cause (the query ran slowly because the system was running slowly), and a query may already be well optimised yet still run slowly because of schema issues.
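As a sketch of the kind of rewrite involved (the table and column names here are hypothetical), compare a query that wraps an indexed column in a function with one that expresses the same condition as a range the optimiser can use:

```sql
-- Hypothetical table: orders(id, customer_id, order_date DATE, total DECIMAL)
-- with an index on order_date.

-- The function call on order_date stops MySQL from using the index on that column:
SELECT id, total
FROM orders
WHERE YEAR(order_date) = 2023;

-- Rewritten as a range predicate, the same index becomes usable:
SELECT id, total
FROM orders
WHERE order_date >= '2023-01-01'
  AND order_date <  '2024-01-01';

-- EXPLAIN shows whether the index is actually being chosen:
EXPLAIN SELECT id, total
FROM orders
WHERE order_date >= '2023-01-01'
  AND order_date <  '2024-01-01';

-- On reasonably recent MySQL versions the slow query log can be switched on at
-- runtime (on older versions it is configured in my.cnf instead):
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 1;   -- log anything slower than one second
```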
Schema optimisation looks at how the tables, fields and indexes are laid out. Sometimes performance can be improved simply by creating a new index, but other times it may require more invasive schema changes such as splitting or creating fields or tables. In these situations the DBA needs to know how people use the data in the database in order to restructure it well – there is no ‘one size fits all’ solution to schema problems. Schemas usually grow organically over time in response to needs, and it’s easy to create short-term fixes that don’t scale up particularly well; it’s important that people recognise this. Restructuring a schema is not an indication of a failure to plan properly – it’s an opportunity to correct initial (and well-intentioned) assumptions about how the database was going to be used.
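A minimal sketch of both kinds of change, using hypothetical table and column names – first adding an index to match a common access pattern, then splitting a large, rarely-read column out into its own table:

```sql
-- Adding an index that matches a frequent lookup pattern:
ALTER TABLE orders ADD INDEX idx_customer_date (customer_id, order_date);

-- A more invasive change: moving a bulky, rarely-read column into its own table
-- so that scans of the main table touch far less data.
CREATE TABLE order_notes (
    order_id INT NOT NULL PRIMARY KEY,
    notes    TEXT,
    FOREIGN KEY (order_id) REFERENCES orders (id)
) ENGINE=InnoDB;

INSERT INTO order_notes (order_id, notes)
SELECT id, notes FROM orders WHERE notes IS NOT NULL;

ALTER TABLE orders DROP COLUMN notes;
```

Whether a change like this actually helps depends entirely on how the data is queried – which is exactly why the DBA needs to understand the access patterns first.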
Always use separate date and time columns – it’s easy enough to join them together if necessary, and you’ll find that in a lot of cases people are only interested in results between certain dates (from midnight to midnight) rather than between certain dates and times. An index on the date column should provide good cardinality and is very likely to be used when doing temporal searches.
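A hypothetical log table along those lines might look like this – the names and column types are illustrative rather than prescriptive:

```sql
CREATE TABLE access_log (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    log_date DATE NOT NULL,
    log_time TIME NOT NULL,
    url      VARCHAR(255) NOT NULL,
    INDEX idx_log_date (log_date)
);

-- The common 'between these dates' query can use the index directly:
SELECT COUNT(*)
FROM access_log
WHERE log_date BETWEEN '2023-01-01' AND '2023-01-31';

-- A full timestamp is easy to reassemble on the occasions it is needed:
SELECT id, TIMESTAMP(log_date, log_time) AS logged_at
FROM access_log
WHERE log_date = '2023-01-15';
```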
Server optimisation goes beyond the data and looks at the system itself. This is the kind of optimisation that examines buffers and caches, and the disk/memory/CPU usage of the hardware the database server runs on. It requires in-depth knowledge of both the usage and the capabilities of the database server, and is needed comparatively rarely – generally, once a server has been set up and initially tuned there is little more to do. Server optimisation may require the help of an experienced system administrator to tune the hardware and operating system (particularly the disk I/O subsystem) to provide as much capacity as possible to the database server. The DBA’s job is to check that buffers and caches are set to useful levels (too much can be just as bad as too little – adding more memory to a buffer or cache that already has free space will not make it work any better) and potentially to create new key buffers to reduce buffer contention. It may also involve inspecting and changing disk usage – separating schemas, or even individual tables, onto separate disks (potentially different storage controllers for very high performance) to reduce disk I/O bottlenecks.
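As an illustration of that kind of tuning (assuming MyISAM tables, which is where named key caches apply, and a hypothetical table name), the inspection and adjustment might look like this:

```sql
-- Check how well the current key buffer is being used; a high ratio of
-- Key_reads to Key_read_requests suggests the buffer is too small:
SHOW GLOBAL VARIABLES LIKE 'key_buffer_size';
SHOW GLOBAL STATUS LIKE 'Key_read%';

-- Create an additional named key cache and assign a hot table's indexes to it,
-- reducing contention on the default key buffer:
SET GLOBAL hot_cache.key_buffer_size = 128 * 1024 * 1024;
CACHE INDEX session_data IN hot_cache;
LOAD INDEX INTO CACHE session_data;
```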
Always push for the best server hardware you can for the database server, even if it means having to run additional services (such as a web server) on it. In time these additional services can be migrated off your database server, leaving you with (hopefully) a pretty powerful server dedicated to the database. Most other services can be migrated with little or no downtime, whereas moving the database server onto a new platform will almost always result in downtime (unless you have a pretty special setup). In most cases you’ll find that performance issues can be handled effectively with replicated slaves and few code changes, allowing you to ‘scale out’ rather than being stuck with ‘scale up’.
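As a rough sketch of that scale-out route (hostnames, credentials and log coordinates below are placeholders, and the exact options vary between MySQL versions):

```sql
-- On the master (my.cnf): binary logging and a unique server-id are required, e.g.
--   [mysqld]
--   server-id = 1
--   log-bin   = mysql-bin

-- On each slave (server-id = 2, 3, ...), point at the master and start replicating:
CHANGE MASTER TO
    MASTER_HOST     = 'db-master.example.com',
    MASTER_USER     = 'repl',
    MASTER_PASSWORD = '********',
    MASTER_LOG_FILE = 'mysql-bin.000001',
    MASTER_LOG_POS  = 4;
START SLAVE;
```

Read-heavy queries can then be directed at the slaves while writes continue to go to the master, which is usually a much smaller code change than re-architecting the application.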
One thing a DBA should not be afraid of is telling people where they’re going wrong and how to improve things. Initially there may be resistance to ‘interference’ (no-one deliberately designs things sub-optimally, and people may get defensive when they feel their design decisions are being challenged), so a good DBA needs effective people-management skills as well as technical skills. Make it clear that you’re working with people, for their own benefit, and try to educate rather than dictate changes. People naturally like to show off what they know, so if you educate your users about performance they are likely to educate each other in turn. A well-educated development team can make a DBA’s life much easier!