Source: http://hezmatt.org/~mpalmer/blog/2011/10/28/rsync-for-lvm-managed-block-devices.html

If you’ve ever had to migrate a service to a new machine, you’ve probably found rsync to be a godsend. Its ability to pre-sync most data while the service is still running, then perform the much quicker “sync the new changes” action after the service has been taken down, is fantastic.

For a long time, I’ve wanted a similar tool for block devices. I’ve managed ridiculous numbers of VMs in my time, almost all stored in LVM logical volumes, and migrating them between machines is a downtime hassle. You need to shut down the VM, do a massive dd | netcat, and then bring the machine back up. For a large disk, even over a fast local network, this can be quite an extended period of downtime.

The naive implementation of a tool capable of doing a block-device rsync would be to checksum the contents of the device, possibly in blocks, and transfer only the blocks that have changed. Unfortunately, as network speeds approach disk I/O speeds, this becomes a pointless operation. Scanning 200GB of data and checksumming it still takes a fair amount of time; in fact, it’s often nearly as quick to just send all the data as it is to checksum it and then send the differences.[1]

No, a different approach is needed for block devices. We need something that keeps track of the blocks on disk that have changed since our initial sync, so that we can just transfer those changed blocks.

As it turns out, keeping track of changed blocks is exactly what LVM snapshots do. They actually keep a copy of what was in the blocks before they changed, but we’re not interested in that so much. No, what we want is the list of changed blocks, which is stored in a hash table on disk.
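To make the contrast between the two approaches concrete, here is a minimal Python sketch. It is not lvmsync's code, and it does not read real LVM snapshot metadata. The function names, the 4096-byte chunk size, and the idea of passing the changed chunks in as a plain list of indexes are all assumptions made for illustration. The first function is the naive approach, which has to read every byte on both sides. The second copies only the chunks it is told about, which is the job a changed-block list makes possible.

```python
import hashlib
import io

CHUNK_SIZE = 4096  # illustrative only; a real snapshot defines its own chunk size


def changed_chunks_by_checksum(src, dst, chunk_size=CHUNK_SIZE):
    """Naive approach: hash every chunk on both sides and compare.

    Returns the indexes of chunks whose contents differ. Every byte of
    both devices is read, which is the cost described above.
    """
    changed = []
    index = 0
    src.seek(0)
    dst.seek(0)
    while True:
        a = src.read(chunk_size)
        b = dst.read(chunk_size)
        if not a and not b:
            break
        if hashlib.sha256(a).digest() != hashlib.sha256(b).digest():
            changed.append(index)
        index += 1
    return changed


def copy_chunks(src, dst, chunk_indexes, chunk_size=CHUNK_SIZE):
    """Tracked approach: copy only the chunks named in chunk_indexes.

    chunk_indexes stands in for the list a snapshot would provide.
    Returns the number of bytes copied.
    """
    copied = 0
    for index in chunk_indexes:
        offset = index * chunk_size
        src.seek(offset)
        data = src.read(chunk_size)
        dst.seek(offset)
        dst.write(data)
        copied += len(data)
    return copied


if __name__ == "__main__":
    # Two in-memory "devices" of 8 chunks; chunks 2 and 5 differ.
    old = bytes(8 * CHUNK_SIZE)
    new = bytearray(old)
    new[2 * CHUNK_SIZE] = 0xFF
    new[5 * CHUNK_SIZE + 10] = 0x01
    source = io.BytesIO(bytes(new))
    destination = io.BytesIO(old)

    tracked = [2, 5]  # hypothetical changed-chunk list
    copy_chunks(source, destination, tracked)
    print(destination.getvalue() == source.getvalue())
```

Both functions accept any seekable binary file object, so the same sketch could in principle be pointed at block devices opened in binary mode. In the scenario the post describes, the two devices sit on different machines, so the chunk data would travel over the network instead of between two local file objects.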
All that was missing was a tool that read this hash table to get the list of blocks that had changed, then sent them over a network to another program that was listening for the changes and could write them into the right places on the destination.

That tool now exists, and is called lvmsync (http://theshed.hezmatt.org/lvmsync). It is a slightly crufty chunk of Ruby that, when given a local LV and a remote machine and block device, reads the snapshot metadata and transfers the changed blocks over an SSH connection it sets up.

Be warned: at present, it’s a pretty raw piece of code. It does nothing but the “send updated blocks over the network” step, so you have to deal with the snapshot creation, initial sync, and so on. As time goes on, I’m hoping to polish it and turn it into something Very Awesome. “Patches Accepted”, as the saying goes.

[1] rsync avoids a full-disk checksum because it cheats and uses file metadata (the last-modified time, or mtime, of a file) to choose which files can be ignored. No such metadata is available for block devices (in the general case).
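To illustrate the footnote, here is a rough Python sketch of the kind of metadata shortcut rsync relies on. By default rsync skips files whose size and modification time match on both sides. The function name `can_skip` and the whole-second mtime comparison are assumptions for this sketch, not rsync's actual implementation.

```python
import os
import tempfile


def can_skip(src_path, dst_path):
    """Return True if the destination looks up to date by metadata alone.

    Compares only file size and whole-second mtime; file contents are
    never read. A missing destination always needs a transfer.
    """
    src = os.stat(src_path)
    try:
        dst = os.stat(dst_path)
    except FileNotFoundError:
        return False
    return (src.st_size == dst.st_size
            and int(src.st_mtime) == int(dst.st_mtime))


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    a = os.path.join(workdir, "a")
    b = os.path.join(workdir, "b")
    for path in (a, b):
        with open(path, "wb") as handle:
            handle.write(b"hello")
        os.utime(path, (1000, 1000))  # force identical timestamps
    print(can_skip(a, b))
```

A block device has no per-region equivalent of mtime to consult, which is why the changed-block list has to come from somewhere else, such as a snapshot.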