One of our slave databases crashed few days ago due to some disk issues.
After everything is back to operational - I noticed that the replication has stopped in this slave. I tried to restart the replication but it was unsuccessful.
I then checked the MySQL log and found the following:
1 2 3 4 5
And so MySQL is complaining that it can’t find the relay log file and sure enough I had a look into the directory, I couldn’t find it.
So what to do? I Googled around for some magic solutions - this link came up on top. The proposed solution was to:
CHANGE MASTER TO the positon _on the master_ where the slave was, the slave will get the transactions from the master binary logs again and you will lose nothing. Requires that the master still has those binary logs.
However this is not the problem, it is not that the slave doesn’t know the master bin log position. My guess is the relay log is now outdated and needs to be regenerated.
So what we really need to do is to tell the slave to regenerate its relay log files.
Here are the steps to do it:
- On the crashed slave - record the last Master bin log file and position before the crash
- The slave should not be running - if it is - issue STOP SLAVE
- And then issue RESET SLAVE command
- Go to the MySQL directory - ensure that relay logs, master info and relay-log.info are deleted - they should be deleted when you run RESET SLAVE. Note: When I did this the first time - MySQL didn’t delete the relay logs - running RESET SLAVE the second time fixed it
- Now issue CHANGE_MASTER log position to the position that you have recorded on step 1 above
- Issue START SLAVE
Now the replication should start again - you will notice that the relay log files are re-numbered from the beginning ie: suffix 00001.