Friday, September 03, 2004
Mysql & Replication falling over
Mysql's replication feature is a great
way of providing redundancy and load sharing for database servers.
Unfortunately, as we've learnt from bitter experience, it doesn't recover
particularly well from system crashes (clean shutdowns are okay). My guess
is, since the bin log is being written almost constantly, corruption occurs.
Having spent a fair amount of time ploughing through documentation on how to
recover from this, I thought I'd share our quick? and easy step by step
solution.
There are two situations where this solution applies. Anything other than
these two and you're on your own.
Situation 1: Where you have a closed replication loop (A -> B -> C
-> D -> A) with writes to any number of the replicated databases you can
recover provided that replication only fails at
one point in the
loop. Typically this will be the database immediately after the host that
crashed (so if B crashes, C will stop replicating). The reason this works
is that changes still propogate back to the crashed machine by virtue of all
other replication still working.
Situatuon 2: Where you have a replication change (A -> B -> C) with
read
only access at the point of failure and below.
Okay, so now what you've got to do is go to the database host immediately
before the one that failed (ie the last one still working) and then do the
following:
RESET MASTER;
Then go to the slave that's failed and do this:
SLAVE STOP;
RESET SLAVE;
LOAD DATA FROM MASTER;
SLAVE START;
You can check replication is function again using the
SHOW SLAVE STATUS;
command as per normal.
You might find that you need to sequentially do this on each database below
the first failed one. In the case of a loop, you might have to go all the
way around the loop back to the first one that you RESET MASTER
on.
posted by guy at: 08:54 SAST |
path: /systems |
permanent link
