Tag Archives: replication

How to check MySQL replication consistency

If you want to be sure the data on the slave is the same as the data on the master (yes, sometimes it can happen to be different) you can use pt-table-checksum (part of percona-toolkit) to compute checksums for the data in the tables and then compare the checksums from the master with the ones on the slave(s).

Some options

Use with replication

pt-table-checksum can be used to compare any two databases/tables, but if you want to compare everything on a master and its slaves you can use the
--replicate option to connect only to the master and compute checksums there. The checksums will then be computed on the slaves too, by replicating the checksum statements.

Detecting slaves

If you have slave hosts running on non-standard ports, use the option --recursion-method=hosts to tell pt-table-checksum how to detect the slaves. Tell the slaves to report their hostname and port with report-host and report-port in my.cnf. This will make them show up in SHOW SLAVE HOSTS issued on the master.
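For example, a minimal my.cnf snippet on a slave might look like this (the hostname and port are placeholders for your own values):

[mysqld]
report-host = slave1.example.com
report-port = 3307

After restarting the slave it should show up in SHOW SLAVE HOSTS on the master.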

If the "hosts" method doesn't work, try --recursion-method=dsn=t=dbname.dsns instead. Create a table named "dsns" with the following structure in the "dbname" database:

CREATE TABLE `dsns` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `dsn` varchar(255) NOT NULL,
 PRIMARY KEY (`id`)
);

And put the DSNs for accessing the slaves in the dsn field.
Example:

insert into dsns values (NULL,'h=1.1.1.2,u=root,P=3306,p=slavepassword');
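With the table populated you could then point pt-table-checksum at it with something like the command below (credentials and host are placeholders; see the full example usage further down):

pt-table-checksum --recursion-method=dsn=t=dbname.dsns --replicate mk.checksums -u root 127.0.0.1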

For MySQL without the InnoDB plugin

--lock-wait-time is required if you are using a version of MySQL without the InnoDB plugin.

Database for storing checksums

pt-table-checksum stores the checksums in MySQL, so you have to create a database to hold the checksums table. I named mine "mk" since the tool was called mk-table-checksum before it became part of Percona Toolkit.

Use the --create-replicate-table option to create the checksums table if it doesn't already exist.

Example usage

pt-table-checksum --recursion-method=hosts --lock-wait-time=50 --defaults-file=/home/mihai/mysql.pass -u root --create-replicate-table --replicate-check --replicate mk.checksums 127.0.0.1

Example output

            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
09-13T18:25:13      0      0      361       1       0   0.019 mydb.accounts
09-13T18:25:13      0      0       91       1       0   0.079 mydb.announcements

How to restore mysql replication

Something went wrong and your MySQL replication broke. I'm talking here about problems with the SQL thread, not connection problems.

The SQL thread shows you an error; what do you do to fix it and resume replication?

Here are 3 ways to fix it. Each has advantages and disadvantages; pick the one that best fits your problem.

1. Skip over the problem

You can try to just skip over the statement that broke the replication by changing the position in the log file.

There are two ways to do this:

a) you can skip gradually

STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;
SHOW SLAVE STATUS \G

That would skip the next statement, but you can set the counter higher to skip more than one.
Do it until the slave status shows the SQL thread is running.

b) skip to the current position

Use this if the first method keeps showing other statements that break replication and you don't have time to gradually skip statements.

First go on the master and run SHOW MASTER STATUS to find the current binary log file and the current position within the file.

Then go on the slave, stop it with STOP SLAVE and change the file name and position. Something like:

STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.001958', MASTER_LOG_POS=52562937;
START SLAVE;

But do that with your own file name and position taken from the master.

Check the replication status with SHOW SLAVE STATUS.
If the results are good (both Slave_IO_Running and Slave_SQL_Running are Yes) then you can go to the next step; otherwise skip to the next methods.

At this point you have a working replication, but the data on the slave is probably not the same as on the master since you skipped a few SQL statements.

To fix it you can use Maatkit (mk-table-checksum and mk-table-sync).

2. Full Dump and Restore

Connect to the master, dump everything into a SQL file, copy it to the replication slave and load it into MySQL.

Use --master-data so the replication position is set in the dump file and the slave will know where to start.

Use --disable-keys so the slave will not try to build indexes after each insert and will only build them at the end of the import.

Use --add-locks to surround each table dump with LOCK TABLES/UNLOCK TABLES - this makes the inserts faster on the slave.

Problem:
--master-data will put a read lock on the tables, so operations on your master will block waiting for the dump to finish. On large databases this can take a long time, which is unacceptable.

Possible fix:
If you have InnoDB tables, add --single-transaction so a single global lock will be used only for a short time at the beginning of the transaction.

The problem is also not so big if you can take filesystem snapshots on the master, like the ones created by LVM.
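Putting those flags together, the whole procedure might look like this sketch (hostnames and file names are placeholders for your own):

# on the master
mysqldump --master-data --single-transaction --disable-keys --add-locks --all-databases > full_dump.sql
# copy full_dump.sql to the slave, then on the slave:
mysql < full_dump.sql
mysql -e 'START SLAVE;'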

3. Inconsistent Full Dump

This is just another way around the locking problem of method #2: dump the data just like before, but without using --master-data. This means no locks, so the master can still work normally.
But because you don't use --master-data you will have to set the position on the slave yourself.
On the master type:

show master status \G

Take the file name and position and use them in a CHANGE MASTER statement on the slave (after you load the dump file). Something like:

CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.001958', MASTER_LOG_POS=52562937;

Of course all of this will create an inconsistent slave, but you can fix that easily with Maatkit.
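For example, with the checksums stored by pt-table-checksum --replicate (as in the article above), mk-table-sync's successor pt-table-sync can repair the slaves with something like this (the master host and credentials are placeholders):

pt-table-sync --execute --replicate mk.checksums h=master.example.com,u=root,p=yourpassword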

If you know other methods I'd love to hear about them. Let me know in the comments.

managing mysql binary logs

Binary logs are how MySQL keeps track of what has changed in the databases. Keeping such logs is recommended (in case you want to recover the database) and sometimes even required (if you want to replicate the db), but if you have a server where the database changes frequently those logs will occupy a lot of your disk space.

If you are tempted to just delete (rm) some of the old log files, DON'T do it. Or if you do, remember to also update the index file and remove from it the lines pointing to the log files you deleted, otherwise you will get in trouble: depending on your version, the mysqld server might not start next time.

A better way than deleting them directly from the file system is to use the PURGE BINARY LOGS statement to delete all logs prior to a certain log file or prior to a certain date. The only problem with this is that you still have to remember to run it from time to time, or set up a cron job to do it, or else you will end up doing it when MySQL dies because it ran out of disk space.
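For example (the log file name and date here are just placeholders):

PURGE BINARY LOGS TO 'mysql-bin.000123';
PURGE BINARY LOGS BEFORE '2011-01-01 00:00:00';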

Luckily there is an even better solution: a configuration option for the MySQL server that lets you specify the number of days you want to keep logs for. Everything older than that number of days will be automatically deleted by the MySQL server. The configuration variable is named expire_logs_days; something like expire_logs_days=30 will delete all log files older than 30 days.
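In my.cnf that would look like this:

[mysqld]
expire_logs_days = 30

It is also a dynamic variable, so on a running server you can change it without a restart:

SET GLOBAL expire_logs_days = 30;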

Warning! PURGE BINARY LOGS and expire_logs_days might not work if you deleted the binlog files directly from the file system. To make them work you will have to check each line in the .index file: each line contains a binlog file name, and if any file mentioned there no longer exists on disk you will have to delete that line. Then just restart the MySQL server.
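A rough sketch of that cleanup, assuming the data directory is /var/lib/mysql and the index file is mysql-bin.index (adjust both to your setup):

cd /var/lib/mysql
# keep only the lines whose binlog file still exists on disk
while read f; do [ -e "$f" ] && echo "$f"; done < mysql-bin.index > mysql-bin.index.clean
mv mysql-bin.index.clean mysql-bin.index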

One other tip to make the logs use less disk space is to tell the server not to record logs for databases where you don't care about logging (like databases you only use for development or testing, which might still get a lot of updates but which you don't want to replicate or need to recover if anything breaks). You can either tell MySQL to only keep binary logs for some databases or to ignore others; the binlog-do-db and binlog-ignore-db configuration options will help you with this, as shown below.
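For example, in my.cnf (the database names are hypothetical, and each option takes a single database, so repeat the line for more than one):

[mysqld]
# either list only the databases you want logged ...
binlog-do-db = production_db
# ... or log everything except some databases
binlog-ignore-db = test_db

Normally you would use one option or the other, not both.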

DRBD 8.0.1 released


This is a maintenance release of the 8.0 code that fixes a few bugs, as the original announcement says:

Tara! The first maintenance release of the 8.0 code:

8.0.1 (api:86/proto:86)
--------
* Fixed some race conditions that could trigger an OOPS when the local disk fails and DRBD detaches itself from the failing disk.
* Added a missing call to drbd_try_outdate_peer().
* LVM's LVs expose ambiguous queue settings. When a RAID-0 (md) PV is used they present a segment size of 64k but at the same time allow only 8 sectors. Fixed DRBD to deal with that fact correctly.

mysql replication monitor

This is not a tutorial about how to set up MySQL replication; you can find all the details about that in the official MySQL documentation. This is just a script that can be used to monitor a MySQL replication setup. A MySQL replication setup consists of a master server and a slave server. On the slave server there are two threads that run continuously: the I/O thread, which fetches the changes that occurred on the master server, and the SQL thread, which runs the queries that were executed on the master server.
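As a very rough sketch of what such a check does (this is not the script from the article; the host and credentials are placeholders):

#!/bin/sh
# alert if either replication thread is not running on the slave
STATUS=$(mysql -h slave.example.com -u monitor -psecret -e 'SHOW SLAVE STATUS\G')
echo "$STATUS" | grep -q 'Slave_IO_Running: Yes' || echo 'I/O thread not running'
echo "$STATUS" | grep -q 'Slave_SQL_Running: Yes' || echo 'SQL thread not running'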


drbd 8.0 released

DRBD (Distributed Replicated Block Device) is a Linux block device designed to mirror a whole block device over a network link. Today the team developing DRBD released version 8.0.

Among the many bug fixes and improvements in the new version we find support for primary/primary operation (two-way synchronization) for distributed file systems such as OCFS2 and GFS, optional peer authentication with a shared secret, and improved tunable after-split-brain recovery strategies.
