Binary logging is one of those options that is available to all, but in most cases is never turned on. Here is why I think it’s a good idea to turn it on.
When people hear “binary logging”, they think “replication”, and usually think “Not relevant to me”. However, binary logging allows:
Replication
Okay, yes, binary logging is critical for replication. Usually, most people don’t start out thinking they’ll need a replicated setup, so they don’t bother turning binary logging on.
The flip side to this, however, is that by the time they do need replicated slaves (usually to spread the query load) downtime isn’t really an option, and you really need downtime to take a consistent snapshot of the data to create replicated slaves. Also, at this point you may have a considerable amount of data in the database, which will make taking this snapshot a lengthy operation, further stretching your downtime.
You might not think you need a replicated slave now, but you may in the future, so turning this on now might save you time later.
Auditing
Half of the sales table has disappeared, and suddenly you’re the most popular person in the company – well, if you just count the number of calls you’re getting, rather than the fact that it’s mostly people are asking “When are you going to fix it?”
Nobody knows (or will admit to) anything. How can you tell what happened, in order to:
a) find out what went wrong,
b) stop it from happening again, and possibly,
c) slap someone for lying to you and making your job harder (if they’d just told you what happened, you wouldn’t need to do the first two and it’ll all be fixed quicker!)
Binary logs store all the data modification statements, so inspecting the binary log will tell you the time, MySQL user, IP address and statement that’s caused all of the excitement. It doesn’t store what was in a particular row or field, just what it’s being change to. The problem might be a flaw in a script or program (in which case you need to get the developers to fix or disable something) or it might be someone forgetting to put a condition on their DELETE or UPDATE statement – in either case, you don’t want to get the data back just for it to get eaten again.
Which leads nicely to…
Point in Time Recovery (PiTR)
Backups are never a bad idea, but the contents of databases tend to change extremely quickly, making a backup stale as soon as it’s finished. Even if you can restore the last 5 years of sales data when recovering from a data disaster, people will still complain about having to redo the last couple of hours work (or, worse still, will be unable to redo the last couple of hours work)
Once again, as binary logs record all data modification statements, it’s easy to re-run all the changes made between the time the backup was taken and any point in time since. Instead of losing an entire day’s work, you may be able to reduce the data loss to just a couple of minutes. You may still get people complaining about that (or even that there was downtime in the first place), but that’s all part of an Administrator’s job (System or Database)
The cost of binary logging
So, that’s the good things about binary logging, but very few things come for free, so what’re the costs?
First of all, binary logs take up disk space. Depending on how often the data is modified, this may become quite sizeable, so it’s worth monitoring. If you’ve turned binary logging on right from the start, you’ll also be able to determine the trend, allowing for capacity planning.
As the binary logs are stored on disk, they take up I/O bandwidth. In most cases, this isn’t a limiting factor for server performance, but some people like to cover this anyway. Putting binary logs on another disk (not just another partition) can reduce the amount of thrashing on a disk, having the disk on a separate controller can help reduce channel contention (although you’re still likely to have contention on whatever bus the controllers are on – you have pay the price eventually, although generally it’s not an issue)
The MySQL server will need to spend some (marginal) time and CPU to write the binary log. They’re fairly simple files (open one up with a hex editor and take a peek) and are simply appended to, so it’s not particularly taxing.
Pros vs. Cons – the final word
For most people, storage is cheap, so the extra space needed for binary logging isn’t significant. Some people may be interested in the auditing facilities, but it’s the PiTR ability that will interest most serious DBAs (we hate losing data) and will usually swing the balance firmly in favour of having binary logging turned on. If you’re okay with losing all database changes, both schemata and data, since your last backup (either because you back up very very frequently or because your data doesn’t change much) then it might not be so attractive.
My advice? Turn it on, unless you have a very good reason not to.