Wednesday, April 30, 2008

Disaster Handling in MySQL

MySQL Conference Liveblogging: Disaster Is Inevitable - Are You Prepared? (Tuesday 4:25PM) | beer planet
# Suicide

* having no backups
* depending on slaves for backup
* keeping backups on same SAN
* having a single DBA - Frank didn't like this one at all
* not keeping binlogs

# Restoring from backup

* how much time?
* uncompressed backup ready to mount?
* separate network for recovery?

# In Fotolog, 1TB of data was severely hit.

* first problem: backup was highly compressed (tar.gz)
* uncompressing took hours
* so keep uncompressed backups (at least last N days)
* it should be mountable, rather than transferable
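
To put "uncompressing took hours" in perspective, here is a rough back-of-the-envelope sketch. The throughput figures are hypothetical placeholders, not numbers from the talk; plug in what your own hardware actually sustains.

```python
def restore_hours(size_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to decompress or copy size_gb at a sustained throughput."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = size_gb * 1024 / throughput_mb_s  # GB -> MB, then MB / (MB/s)
    return seconds / 3600


# Assumed throughputs, for illustration only (not measured at Fotolog).
scenarios = [
    ("decompress tar.gz", 50),
    ("copy over the network", 100),
]

for label, mb_s in scenarios:
    print(f"{label}: {restore_hours(1024, mb_s):.1f} h for 1TB")
```

Even with generous assumptions, a 1TB restore that has to be decompressed or transferred first is measured in hours, which is the argument for a backup that can simply be mounted.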

# Frank going over recovery modes at http://dev.mysql.com/doc/refman/5.0/en/forcing-recovery.html
# Row by row recovery

* row by row recovery (get the range of ids)
* custom scripts
* may not be able to use primary key
* foreign key based retrieval faster
* lose 4 seconds for each record that crashes the server (at Fotolog, for some reason some values were crashing mysqld)
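
A minimal sketch of what such a custom script might look like. This is not Frank's actual script. `recover_range` and `fetch_row` are hypothetical names; in a real run `fetch_row` would issue a single-row SELECT and, when the server dies, wait for mysqld to come back before the loop continues (the "4 seconds" above). Here a fake fetcher stands in for the database so the logic can be run anywhere.

```python
from typing import Callable, Dict, List, Optional, Tuple


def recover_range(
    fetch_row: Callable[[int], Optional[dict]],
    first_id: int,
    last_id: int,
) -> Tuple[Dict[int, dict], List[int]]:
    """Fetch ids first_id..last_id one at a time, skipping ids whose fetch fails."""
    recovered: Dict[int, dict] = {}
    failed: List[int] = []
    for row_id in range(first_id, last_id + 1):
        try:
            row = fetch_row(row_id)
        except Exception:
            # The read failed (e.g. the server crashed on this record):
            # note the id and move on to the next one.
            failed.append(row_id)
            continue
        if row is not None:  # None means the id simply does not exist
            recovered[row_id] = row
    return recovered, failed


def fake_fetch(row_id: int) -> Optional[dict]:
    """Stand-in for a real single-row query; ids 3 and 7 simulate crashes."""
    if row_id in (3, 7):
        raise ConnectionError("server went away")
    if row_id == 5:
        return None  # gap in the id sequence
    return {"id": row_id}


rows, bad_ids = recover_range(fake_fetch, 1, 8)
print("recovered:", sorted(rows))
print("failed:", bad_ids)
```

Keeping the list of failed ids matters: it tells you exactly which records were lost and can be retried later by another route, such as the foreign key.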

# Lessons

* SANs make sense (in some environments)
* try to replicate the whole SAN (in Fotolog, a SAN actually failed because of a bug in its maintenance program)
* everything will fail at some point
* back up everything (cron jobs, my.cnf, custom scripts)
* have backups in a form ready to restore
* don't count replication as a backup
* be worried about 'routine' operations
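
As one illustration of "back up everything" in a ready-to-restore form, the sketch below collects config files into an uncompressed tar. The function name and any paths you pass in are assumptions for the example, not something shown in the talk.

```python
import tarfile
from pathlib import Path
from typing import Iterable, List


def archive_configs(paths: Iterable[str], dest_tar: str) -> List[str]:
    """Write existing paths into an uncompressed tar; return the missing ones."""
    missing: List[str] = []
    # Mode "w" writes a plain tar with no compression, so nothing has to be
    # decompressed before the files can be restored.
    with tarfile.open(dest_tar, "w") as tar:
        for p in paths:
            if Path(p).exists():
                tar.add(p)  # directories are added recursively
            else:
                missing.append(p)
    return missing
```

A call might look like `archive_configs(["/etc/my.cnf", "/etc/cron.d"], "/backup/configs.tar")`, with both the source paths and the destination being placeholders. The returned list of missing paths is worth checking, since a file that silently dropped out of the backup is exactly the kind of 'routine' surprise the last bullet warns about.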

# Peter Zaitsev of Percona takes the stage to talk about his homegrown tools for InnoDB recovery

* innodb-tools - can recover data even if mysqld won't start, for example if half of a RAID0 array fails or somebody deleted some data. It recovers directly from the InnoDB tablespaces.
